MATRIX-REMODELING GENES
TECHNICAL FIELD

The invention relates to novel matrix-remodeling genes identified by their
coexpression with known matrix-remodeling genes. The invention also relates to the use of
these biomolecules in diagnosis, prognosis, prevention, treatment, and evaluation of therapies
for diseases, particularly diseases associated with matrix remodeling such as cancer,
cardiomyopathy, arthritis, angiogenesis, diabetic necrosis, atherosclerosis, fibrosis, and
ulceration.

BACKGROUND OF THE INVENTION 

Matrix remodeling is associated with the construction, destruction, and reorganization
of extracellular matrix components and is essential in normal cellular functions and also in
many disease processes. These disease processes include metastatic cancer, cardiomyopathy,
arthritis, angiogenesis, diabetic necrosis, atherosclerosis, fibrosis, and ulceration (Alexander
and Werb (1991) In: Cell Biology of Extracellular Matrix. Plenum Press, New York NY, pp.
255-302; Schuppan et al. (1993) In: Extracellular Matrix. Marcel Dekker, New York NY, pp.
201-254; Zvibel and Kraft (1993) In: Extracellular Matrix. Marcel Dekker, New York NY, pp.
559-580; Shanahan et al. (1994) J Clin Invest 93:2393-402; Kielty and Shuttleworth (1995) Int
J Biochem Cell Biol 27:747-60; Bitar and Labbad (1996) J Surg Res 61:113-9; Dourado et al.
(1996) Osteoarthritis Cartilage 4:187-96; Grant et al. (1996) Regul Pept 67:137-44;
Gunja-Smith et al. (1996) Am J Pathol 148:1639-48; Alcolado et al. (1997) Clin Sci 92:103-12;
Cs-Szabo et al. (1997) Arthritis Rheum 40:1037-45; Hayward and Brock (1997) Hum Mutat
10:415-23; Ledda et al. (1997) J Invest Dermatol 108:210-4; Hayashido et al. (1998) Int J
Cancer 75:654-8; Ito et al. (1998) Kidney Int 53:853-61; Nelson et al. (1998) Cancer Res
58:232-6).

Many genes that participate in and regulate matrix remodeling are known, but many
remain to be identified. Identification of currently unknown genes will provide new
diagnostic and therapeutic targets. In addition, these genes will provide new opportunities for
therapeutic tissue engineering, that is, the use of drugs or biologicals to direct the creation of
new tissues such as skin, pancreas, or liver that can replace tissues lost to disease or trauma.

The present invention provides new compositions that are useful for diagnosis,
prognosis, treatment, prevention, and evaluation of therapies for diseases associated with
matrix remodeling. We have implemented a method for analyzing gene expression patterns
and have identified 20 novel matrix-remodeling genes by their coexpression with known
matrix-remodeling genes.

SUMMARY OF THE INVENTION 

In one aspect, the invention provides for a substantially purified polynucleotide
comprising a gene that is coexpressed with one or more known matrix-remodeling genes in a
plurality of biological samples. Preferably, each known matrix-remodeling gene is selected
from the group consisting of osteonectin (BM-40), chondroitin/dermatan sulfate proteoglycans
(C/DSPG), collagen I, II, III, and IV, connective tissue growth factor (CTGF), fibrillin,
fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG),
extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth
factor binding protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix
metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP
1, 2, and 3). Preferred embodiments are (a) a polynucleotide sequence selected from the
group consisting of SEQ ID NOs:1-20; (b) a polynucleotide sequence which encodes a
polypeptide sequence of SEQ ID NOs:21, 22, and 23; (c) a polynucleotide sequence having
at least 70% identity to the polynucleotide sequence of (a) or (b); (d) a polynucleotide
sequence comprising at least 18 sequential nucleotides of the polynucleotide sequence of (a),
(b), or (c); (e) a polynucleotide which hybridizes under stringent conditions to the
polynucleotide of (a), (b), (c), or (d); or (f) a polynucleotide sequence which is
complementary to the polynucleotide sequence of (a), (b), (c), (d), or (e). Furthermore, the
invention provides an expression vector comprising any of the above-described
polynucleotides and host cells comprising the expression vector. Still further, the invention
provides a method for treating or preventing a disease or condition associated with the altered
expression of a gene that is coexpressed with one or more known matrix-remodeling genes in
a sample comprising administering to a subject in need the above-described polynucleotides in
an amount effective for treating or preventing said disease.

In a second aspect, the invention provides a substantially purified polypeptide
comprising the gene product of a gene that is coexpressed with one or more known
matrix-remodeling genes in a plurality of biological samples. The known matrix-remodeling gene
may be selected from the group consisting of osteonectin (BM-40), chondroitin/dermatan
sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV, connective tissue growth factor
(CTGF), fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate
proteoglycans (HSPG), extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF
1), insulin-like growth factor binding protein (IGFBP), laminin, lumican, matrix Gla protein
(MGP), matrix metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1,
2, and 3 (TIMP 1, 2, and 3). Preferred embodiments are polypeptides comprising (a) the
polypeptide sequence of SEQ ID NO:21, 22, or 23; (b) a polypeptide sequence having at least
85% identity to the polypeptide sequence of (a); and (c) a polypeptide sequence comprising at
least 6 sequential amino acids of the polypeptide sequence of (a) or (b). Additionally, the
invention provides antibodies that bind specifically to any of the above-described polypeptides
and a method for treating or preventing a disease or condition associated with the altered
expression of a gene that is coexpressed with one or more known matrix-remodeling genes in
a sample comprising administering to a subject in need such an antibody in an amount
effective for treating or preventing said disease.

In another aspect, the invention provides a pharmaceutical composition comprising
the polynucleotide of claim 2 or the polypeptide of claim 3 in conjunction with a suitable
pharmaceutical carrier, and a method for treating or preventing a disease or condition associated
with the altered expression of a gene that is coexpressed with one or more known
matrix-remodeling genes in a sample comprising administering to a subject in need such a
composition in an amount effective for treating or preventing said disease.

In yet a further aspect, the invention provides a method for diagnosing a disease or
condition associated with the altered expression of a gene that is coexpressed with one or more
known matrix-remodeling genes in a sample, wherein each known matrix-remodeling gene is
selected from the group consisting of osteonectin (BM-40), chondroitin/dermatan sulfate
proteoglycans (C/DSPG), collagen I, II, III, and IV, connective tissue growth factor (CTGF),
fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans
(HSPG), extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like
growth factor binding protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix
metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP
1, 2, and 3). The method comprises the steps of (a) providing the sample comprising one or
more of said coexpressed genes; (b) hybridizing the polynucleotide of the coexpressed genes
under conditions effective to form one or more hybridization complexes; and (c) detecting the
hybridization complexes, wherein the altered level of one or more of the hybridization
complexes in a diseased sample compared with the level of hybridization complexes in a
non-diseased sample correlates with the presence of the disease or condition in the sample.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING 
The Sequence Listing provides exemplary matrix-remodeling-associated sequences
including polynucleotide sequences, SEQ ID NOs:1-20, and polypeptide sequences, SEQ ID
NOs:21-23. Each sequence is identified by a sequence identification number (SEQ ID NO)
and by the Incyte Clone number from which the sequence was first identified.

DESCRIPTION OF THE INVENTION 
It must be noted that as used herein and in the appended claims, the singular forms "a,"
"an," and "the" include the plural reference unless the context clearly dictates otherwise.
Thus, for example, a reference to "a host cell" includes a plurality of such host cells, and a
reference to "an antibody" is a reference to one or more antibodies and equivalents thereof
known to those skilled in the art, and so forth.
DEFINITIONS

"NSEQ" refers generally to a polynucleotide sequence of the present invention,
including SEQ ID NOs:1-20. "PSEQ" refers generally to a polypeptide sequence of the
present invention, including SEQ ID NOs:21-23.

A "variant" refers to either a polynucleotide or a polypeptide whose sequence diverges
from SEQ ID NOs:1-20 or SEQ ID NOs:21-23, respectively. Polynucleotide sequence
divergence may result from mutational changes such as deletions, additions, and substitutions
of one or more nucleotides; it may also occur because of differences in codon usage. Each of
these types of changes may occur alone, or in combination, one or more times in a given
sequence. Polypeptide variants include sequences that possess at least one structural or
functional characteristic of SEQ ID NOs:21-23.

A "fragment" can refer to a nucleic acid sequence that is preferably at least 20
nucleotides in length, more preferably 40 nucleotides, and most preferably 60 nucleotides in
length, and encompasses, for example, fragments consisting of nucleotides 1-50 or 200-500
of SEQ ID NOs:1-20. A "fragment" can also refer to polypeptide sequences which are
preferably at least 5 to about 15 amino acids in length, most preferably at least 10 amino acids
long, and which retain some biological activity or immunological activity of, for example, a
sequence selected from SEQ ID NOs:21-23.

WO 00/21986    PCT/US99/23315

"Gene" or "gene sequence" refers to the partial or complete coding sequence of a
transcript. The term also refers to sequences corresponding to 5' or 3' untranslated regions or
5' or 3' untranslated regions including partial or complete coding sequences of a gene.
The novel gene sequences may or may not be homologous to annotated sequences
found in public or private databases. The gene may be in a sense or antisense
(complementary) orientation.

"Known matrix-remodeling gene" refers to a gene sequence which has been previously
identified as useful in the diagnosis, treatment, prognosis, or prevention of diseases associated
with matrix remodeling. Typically, this means that the known matrix-remodeling gene is
expressed at higher levels in tissue abundant in known matrix-remodeling transcripts when
compared with other tissue.

"Matrix-remodeling gene" refers to a gene sequence whose expression pattern is
similar to that of the known matrix-remodeling genes and which is useful in the diagnosis,
treatment, prognosis, or prevention of diseases associated with matrix remodeling. The gene
sequences can also be used in the evaluation of therapies for cancer.

"Substantially purified" refers to a nucleic acid or an amino acid sequence that is
removed from its natural environment and is isolated or separated, and is at least about 60%
free, preferably about 75% free, and most preferably about 90% free from other components
with which it is naturally present.
THE INVENTION 

The present invention encompasses a method for identifying biomolecules that are
associated with a specific disease, regulatory pathway, subcellular compartment, cell type,
tissue type, or species. In particular, the method identifies gene sequences useful in diagnosis,
prognosis, treatment, prevention, and evaluation of therapies for diseases associated with
matrix remodeling, particularly cancer, cardiomyopathy, arthritis, angiogenesis, diabetic
necrosis, atherosclerosis, fibrosis, and ulceration.

The method first identifies polynucleotides that are expressed in a plurality
of cDNA libraries. The identified polynucleotides include genes of known function and genes
known to be specifically expressed in a specific disease process, subcellular compartment, cell
type, tissue type, or species. Additionally, the polynucleotides include genes of unknown
function. The expression patterns of the known genes are then compared with those of the
genes of unknown function to determine whether a specified coexpression probability
threshold is met. Through this comparison, a subset of the polynucleotides for unknown
function genes having a high coexpression probability with the known genes can be identified.
A high coexpression probability corresponds to a due-to-chance probability of coexpression
that falls below a particular threshold, which is less than 0.001, and more preferably less
than 0.00001.

The polynucleotides originate from cDNA libraries derived from a variety of sources
including, but not limited to, eukaryotes such as human, mouse, rat, dog, monkey, plant, and
yeast and prokaryotes such as bacteria and viruses. These polynucleotides can also be selected
from a variety of sequence types including, but not limited to, expressed sequence tags
(ESTs), assembled polynucleotide sequences, full length gene coding regions, introns,
regulatory sequences, 5' untranslated regions, and 3' untranslated regions. To have statistically
significant analytical results, the polynucleotides need to be expressed in at least three cDNA
libraries.

The cDNA libraries used in the coexpression analysis of the present invention can be
obtained from blood vessels, heart, blood cells, cultured cells, connective tissue, epithelium,
islets of Langerhans, neurons, phagocytes, biliary tract, esophagus, gastrointestinal system,
liver, pancreas, fetus, placenta, chromaffin system, endocrine glands, ovary, uterus, penis,
prostate, seminal vesicles, testis, bone marrow, immune system, cartilage, muscles, skeleton,
central nervous system, ganglia, neuroglia, neurosecretory system, peripheral nervous system,
bronchus, larynx, lung, nose, pleura, ear, eye, mouth, pharynx, exocrine glands, bladder,
kidney, ureter, and the like. The number of cDNA libraries selected can range from as few as
20 to greater than 10,000. Preferably, the number of the cDNA libraries is greater than 500.

In a preferred embodiment, gene sequences are assembled to reflect related sequences,
such as assembled sequence fragments derived from a single transcript. Assembly of the
polynucleotide sequences can be performed using sequences of various types including, but
not limited to, ESTs, extensions, or shotgun sequences. In a most preferred embodiment, the
polynucleotide sequences are derived from human sequences that have been assembled using
the algorithm disclosed in "Database and System for Storing, Comparing and Displaying
Related Biomolecular Sequence Information", Lincoln et al., Serial No:60/079,469, filed
March 26, 1998, incorporated herein by reference.

Experimentally, differential expression of the polynucleotides can be evaluated by
methods including, but not limited to, differential display by spatial immobilization or by gel
electrophoresis, genome mismatch scanning, representational difference analysis, and
transcript imaging. Additionally, differential expression can be assessed by microarray
technology. These methods may be used alone or in combination.

Known matrix-remodeling genes can be selected based on the use of the genes as
diagnostic or prognostic markers or as therapeutic targets for diseases associated with matrix
remodeling, such as cancer, cardiomyopathy, arthritis, angiogenesis, diabetic necrosis,
atherosclerosis, fibrosis, and ulceration. Preferably, the known matrix-remodeling genes
include osteonectin (BM-40), chondroitin/dermatan sulfate proteoglycans (C/DSPG),
collagen I, II, III, and IV, connective tissue growth factor (CTGF), fibrillin, fibronectins,
fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG), extracellular
matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth factor binding
protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix metalloproteases
(MMP), tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3), and the
like.

The procedure for identifying novel genes that exhibit a statistically significant
coexpression pattern with known matrix-remodeling genes is as follows. First, the presence or
absence of a gene sequence in a cDNA library is defined: a gene is present in a cDNA library
when at least one cDNA fragment corresponding to that gene is detected in a cDNA sample
taken from the library, and a gene is absent from a library when no corresponding cDNA
fragment is detected in the sample.

Second, the significance of gene coexpression is evaluated using a probability method
to measure a due-to-chance probability of the coexpression. The probability method can be
the Fisher exact test, the chi-squared test, or the kappa test. These tests and examples of their
applications are well known in the art and can be found in standard statistics texts (Agresti, A
(1990) Categorical Data Analysis. John Wiley & Sons, New York NY; Rice, J A (1988)
Mathematical Statistics and Data Analysis. Duxbury Press, Pacific Grove CA). A Bonferroni
correction (Rice, supra, page 384) can also be applied in combination with one of the
probability methods for correcting statistical results of one gene versus multiple other genes.
In a preferred embodiment, the due-to-chance probability is measured by a Fisher exact test,
and the threshold of the due-to-chance probability is set to less than 0.001, more preferably
less than 0.00001.

To determine whether two genes, A and B, have similar coexpression patterns,
occurrence data vectors can be generated as illustrated in Table 1, wherein a gene's presence is
indicated by a one and its absence by a zero. A zero indicates that the gene did not occur in
the library, and a one indicates that it occurred at least once.



Table 1. Occurrence data for genes A and B

          Library 1   Library 2   Library 3   ...   Library N
gene A        1           1           0       ...       0
gene B        1           0           1       ...       0



For a given pair of genes, the occurrence data in Table 1 can be summarized in a 2 x 2
contingency table.
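
The presence/absence rule and the 2 x 2 tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the per-library fragment counts are hypothetical and are not part of the disclosed method.

```python
from collections import Counter

def occurrence_vector(clone_counts):
    """Map per-library cDNA fragment counts to 1 (present) or 0 (absent)."""
    return [1 if n > 0 else 0 for n in clone_counts]

def contingency_table(a, b):
    """Tally a 2 x 2 table from two equal-length occurrence vectors.

    Returns (both_present, only_b_present, only_a_present, both_absent).
    """
    if len(a) != len(b):
        raise ValueError("vectors must cover the same libraries")
    pairs = Counter(zip(a, b))  # keys are (gene A state, gene B state)
    return pairs[(1, 1)], pairs[(0, 1)], pairs[(1, 0)], pairs[(0, 0)]

# Hypothetical fragment counts for four libraries (illustrative only).
gene_a = occurrence_vector([3, 1, 0, 0])
gene_b = occurrence_vector([2, 0, 5, 0])
print(contingency_table(gene_a, gene_b))
```

A gene counts as present as soon as a single fragment is detected, so the fragment count itself does not enter the tally.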

Table 2. Contingency table for co-occurrences of genes A and B

                  Gene A present   Gene A absent   Total
Gene B present           8               2           10
Gene B absent            2              18           20
Total                   10              20           30



Table 2 presents co-occurrence data for gene A and gene B in a total of 30 libraries. Both gene
A and gene B occur 10 times in the libraries. Table 2 summarizes and presents 1) the number
of times gene A and B are both present in a library, 2) the number of times gene A and B are
both absent in a library, 3) the number of times gene A is present while gene B is absent, and
4) the number of times gene B is present while gene A is absent. The upper left entry is the
number of times the two genes co-occur in a library, and the entry for gene A absent and
gene B absent is the number of times neither gene occurs in a library. The off-diagonal entries
are the number of times one gene occurs while the other does not. Both A and B are present
eight times and absent 18 times, gene A is present while gene B is absent two times, and gene
B is present while gene A is absent two times. The probability ("p-value") that the above
association occurs due to chance as calculated using a Fisher exact test is 0.0003. Associations
are generally considered significant if a p-value is less than 0.01 (Agresti, supra; Rice, supra).
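
The p-value quoted for Table 2 can be checked with a short standard-library sketch of the Fisher exact test. The text does not state whether a one-sided or two-sided test was used, so the two-sided form below is an assumption; for these counts both forms give the same value. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(both, only_b, only_a, neither):
    """Two-sided Fisher exact p-value for a 2 x 2 co-occurrence table."""
    row1 = both + only_b          # libraries with gene B present
    col1 = both + only_a          # libraries with gene A present
    n = both + only_b + only_a + neither
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x co-occurrences given fixed margins.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    observed = prob(both)
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

# Counts taken from Table 2: 8 both present, 2 and 2 off-diagonal, 18 both absent.
p = fisher_exact_two_sided(8, 2, 2, 18)
print(f"{p:.4f}")  # prints 0.0003
```

With the margins fixed at 10 occurrences of each gene in 30 libraries, only the tables with 8, 9, or 10 co-occurrences are as unlikely as the observed one, which is why the sum is so small.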

This method of estimating the probability for coexpression of two genes makes several
assumptions. The method assumes that the libraries are independent and are identically
sampled. However, in practical situations, the selected cDNA libraries are not entirely
independent because more than one library may be obtained from a single patient or tissue,
and they are not entirely identically sampled because different numbers of cDNAs may be
sequenced from each library (typically ranging from 5,000 to 10,000 cDNAs per library). In
addition, because a Fisher exact coexpression probability is calculated for each gene versus
41,419 other genes, a Bonferroni correction for multiple statistical tests is necessary.
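
A Bonferroni correction of the kind mentioned above can be sketched as follows. The text does not say whether the 0.001 threshold is applied before or after correction, so treating it as the family-wise level here is an assumption, and both function names are hypothetical.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)

def per_test_threshold(alpha, n_tests):
    """Per-comparison cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

N_TESTS = 41419  # comparisons per gene, as stated in the text

# Cutoff each individual Fisher p-value would have to beat (assumed alpha = 0.001).
print(per_test_threshold(0.001, N_TESTS))
# Adjusted value for a hypothetical raw p-value of 1e-9.
print(bonferroni_adjust(1e-9, N_TESTS))
```

The two functions express the same correction in two equivalent ways: either shrink the cutoff, or inflate each p-value and keep the original cutoff.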

Using the method of the present invention, we have identified 20 novel genes that
exhibit strong association, or coexpression, with known genes that are
matrix-remodeling-specific. These known matrix-remodeling genes include osteonectin (BM-40),
chondroitin/dermatan sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV, connective
tissue growth factor (CTGF), fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1,
heparan sulfate proteoglycans (HSPG), extracellular matrix protein (hevin), insulin-like
growth factor 1 (IGF 1), insulin-like growth factor binding protein (IGFBP), laminin, lumican,
matrix Gla protein (MGP), matrix metalloproteases (MMP), and tissue inhibitors of matrix
metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3). The results presented in Tables 5 and 6
show that the expression of the 20 novel genes has direct or indirect association with the
expression of known matrix-remodeling genes. Therefore, the novel genes can potentially be
used in diagnosis, treatment, prognosis, or prevention of diseases associated with matrix
remodeling, or in the evaluation of therapies for diseases associated with matrix remodeling.
Further, the gene products of the 20 novel genes are potential therapeutic proteins and targets
of therapeutics against diseases associated with matrix remodeling.

Therefore, in one embodiment, the present invention encompasses a polynucleotide
sequence comprising the sequence of SEQ ID NOs:1-20. These 20 polynucleotides are shown
by the method of the present invention to have strong coexpression association with known
matrix-remodeling genes and with each other. The invention also encompasses a variant of
the polynucleotide sequence, its complement, or 18 consecutive nucleotides of a sequence
provided in the above-described sequences. Variant polynucleotide sequences typically have
at least about 70%, more preferably at least about 85%, and most preferably at least about 95%
polynucleotide sequence identity to NSEQ.
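
As a rough illustration of the identity thresholds above, the sketch below computes percent identity for two sequences that are assumed to be already aligned, of equal length, and free of gaps. In practice identity would be taken from an alignment tool such as BLAST. The sequences, function names, and tier labels are hypothetical.

```python
def percent_identity(seq1, seq2):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq1.upper(), seq2.upper()))
    return 100.0 * matches / len(seq1)

def variant_tier(identity):
    """Map an identity percentage onto the 70% / 85% / 95% tiers named in the text."""
    if identity >= 95:
        return "most preferred"
    if identity >= 85:
        return "more preferred"
    if identity >= 70:
        return "variant"
    return "below threshold"

# Hypothetical 20-nucleotide sequences (not taken from the Sequence Listing).
reference = "ATGGCTAGCTAGGCTTACGA"
candidate = "ATGGCTAGCTAGGCTTACGT"
identity = percent_identity(reference, candidate)
print(identity, variant_tier(identity))
```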

One preferred method for identifying variants entails using NSEQ and/or PSEQ
sequences to search against the GenBank primate (pri), rodent (rod), mammalian (mam),
vertebrate (vrtp), and eukaryote (eukp) databases, SwissProt, BLOCKS (Bairoch et al. (1997)
Nucleic Acids Res 25:217-221), PFAM, and other databases that contain previously identified
and annotated motifs, sequences, and gene functions. Methods that search for primary
sequence patterns with secondary structure gap penalties (Smith et al. (1992) Protein
Engineering 5:35-51) as well as algorithms such as BLAST (Basic Local Alignment Search
Tool; Altschul (1993) J Mol Evol 36:290-300; and Altschul et al. (1990) J Mol Biol
215:403-410), BLOCKS (Henikoff and Henikoff (1991) Nucleic Acids Res 19:6565-6572), Hidden
Markov Models (HMM; Eddy (1996) Cur Opin Str Biol 6:361-365; Sonnhammer et al. (1997)
Proteins 28:405-420), and the like, can be used to manipulate and analyze nucleotide and
amino acid sequences. These databases, algorithms, and other methods are well known in the
art and are described in Ausubel et al. (1997; Short Protocols in Molecular Biology. John
Wiley & Sons, New York NY) and in Meyers (1995; Molecular Biology and Biotechnology.
Wiley VCH, New York NY, pp. 856-853).

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to SEQ ID NOs:1-20, and fragments thereof, under stringent conditions. Stringent
conditions can be defined by salt concentration, temperature, and other chemicals and
conditions well known in the art. In particular, stringency can be increased by reducing the
concentration of salt, or raising the hybridization temperature. Varying additional parameters,
such as hybridization time, the concentration of detergent or solvent, and the inclusion or
exclusion of carrier DNA, are well known to those skilled in the art. Additional variations on
these conditions will be readily apparent to those skilled in the art (Wahl and Berger (1987)
Methods Enzymol 152:399-407; Kimmel (1987) Methods Enzymol 152:507-511; Ausubel,
supra; and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, Plainview NY).

NSEQ or the polynucleotide sequences encoding PSEQ can be extended utilizing a
partial nucleotide sequence and employing various PCR-based methods known in the art to
detect upstream sequences, such as promoters and regulatory elements. (See, e.g.,
Dieffenbach and Dveksler (1995) PCR Primer, a Laboratory Manual. Cold Spring Harbor
Press, Plainview NY; Sarkar (1993) PCR Methods Applic 2:318-322; Triglia et al. (1988)
Nucleic Acids Res 16:8186; Lagerstrom et al. (1991) PCR Methods Applic 1:111-119; and
Parker et al. (1991) Nucleic Acids Res 19:3055-306). Additionally, one may use PCR, nested
primers, and PROMOTERFINDER libraries (Clontech, Palo Alto, CA) to walk genomic
DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon
junctions. For all PCR-based methods, primers may be designed using commercially
available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences,
Plymouth MN) or another appropriate program, to be about 18 to 30 nucleotides in length, to
have a GC content of about 50% or more, and to anneal to the template at temperatures of
about 68°C to 72°C.
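
The length and GC-content guidelines above lend themselves to a quick screening check. The sketch below is not a substitute for primer-design software: it ignores annealing temperature, secondary structure, and primer-dimer formation, and the primer sequences and function names are hypothetical.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def meets_primer_guidelines(primer, min_len=18, max_len=30, min_gc=50.0):
    """Check the length and GC-content guidelines stated in the text.

    Annealing temperature (about 68 to 72 degrees C) is not checked here.
    """
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc

# Hypothetical primer sequences for illustration.
print(meets_primer_guidelines("GCGCATGCCGTAGCTAGGCC"))  # 20 nt, 70% GC
print(meets_primer_guidelines("ATATATATTTAAATAT"))      # 16 nt, too short
```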

In another aspect of the invention, NSEQ or the polynucleotide sequences encoding
PSEQ can be cloned in recombinant DNA molecules that direct expression of PSEQ or the
polypeptides encoded by NSEQ, or structural or functional fragments thereof, in appropriate
host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which
encode substantially the same or a functionally equivalent amino acid sequence may be
produced and used to express the polypeptides of PSEQ or the polypeptides encoded by
NSEQ. The nucleotide sequences of the present invention can be engineered using methods
generally known in the art in order to alter the nucleotide sequences for a variety of purposes
including, but not limited to, modification of the cloning, processing, and/or expression of the
gene product. DNA shuffling by random fragmentation and PCR reassembly of gene
fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences.
For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce
mutations that create new restriction sites, alter glycosylation patterns, change codon
preference, produce splice variants, and so forth.
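
The degeneracy of the genetic code referred to above can be illustrated with a minimal translation sketch. The codon table is deliberately partial (only the standard-code codons used in the example), and the two coding sequences are hypothetical rather than drawn from NSEQ.

```python
# Partial standard codon table: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTG": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}

def translate(dna):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Two hypothetical coding sequences that differ at the nucleotide level.
seq_1 = "ATGCTTAAATAA"
seq_2 = "ATGCTGAAGTGA"
print(translate(seq_1), translate(seq_2))
```

Both sequences translate to the same three-residue peptide even though they differ at three nucleotide positions, which is the sense in which different DNA sequences can encode substantially the same polypeptide.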

In order to express a biologically active polypeptide encoded by NSEQ, NSEQ or the
polynucleotide sequences encoding PSEQ, or derivatives thereof, may be inserted into an
appropriate expression vector, i.e., a vector which contains the necessary elements for
transcriptional and translational control of the inserted coding sequence in a suitable host.
These elements include regulatory sequences, such as enhancers, constitutive and inducible
promoters, and 5' and 3' untranslated regions in the vector and in NSEQ or polynucleotide
sequences encoding PSEQ. Methods which are well known to those skilled in the art may be
used to construct expression vectors containing NSEQ or polynucleotide sequences encoding
PSEQ and appropriate transcriptional and translational control elements. These methods
include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. (See, e.g., Sambrook (supra) and Ausubel (supra).)

A variety of expression vector/host cell systems may be utilized to contain and express
NSEQ or polynucleotide sequences encoding PSEQ. These include, but are not limited to,
microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or
cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell
systems infected with viral expression vectors (baculovirus); plant cell systems transformed
with viral expression vectors, cauliflower mosaic virus (CaMV) or tobacco mosaic virus
(TMV), or with bacterial expression vectors (Ti or pBR322 plasmids); or animal cell systems.
The invention is not limited by the host cell employed. For long term production of
recombinant proteins in mammalian systems, stable expression of a polypeptide encoded by
NSEQ in cell lines is preferred. For example, NSEQ or sequences encoding PSEQ can be
transformed into cell lines using expression vectors which may contain viral origins of
replication and/or endogenous expression elements and a selectable marker gene on the same
or on a separate vector.

In general, host cells that contain NSEQ and that express PSEQ may be identified by a
variety of procedures known to those of skill in the art. These procedures include, but are not
limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay
or immunoassay techniques which include membrane, solution, or chip based technologies for
the detection and/or quantification of nucleic acid or protein sequences. Immunological
methods for detecting and measuring the expression of PSEQ using either specific polyclonal
or monoclonal antibodies are known in the art. Examples of such techniques include
enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS).

Host cells transformed with NSEQ or polynucleotide sequences encoding PSEQ may 

30 be cultured under conditions suitable for the expression and recovery of the protein from cell 
culture. The protein produced by a transformed cell may be secreted or retained intracellular^ 
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depending on the sequence and/or the vector used. As will be understood by those of skill in 
the art, expression vectors containing polynucleotides of NSEQ or polynucleotides encoding 
PSEQ may be designed to contain signal sequences which direct secretion of PSEQ or 
polypeptides encoded by NSEQ through a prokaryotic or eukaryotic cell membrane. 

In addition, a host cell strain may be chosen for its ability to modulate expression of
the inserted sequences or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which
cleaves a "prepro" form of the protein may also be used to specify protein targeting, folding,
and/or activity. Different host cells which have specific cellular machinery and characteristic
mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38), 
are available from the American Type Culture Collection (ATCC, Manassas VA) and may be 
chosen to ensure the correct modification and processing of the foreign protein. 

In another embodiment of the invention, natural, modified, or recombinant NSEQ or
nucleic acid sequences encoding PSEQ are ligated to a heterologous sequence resulting in
translation of a fusion protein containing heterologous protein moieties in any of the
aforementioned host systems. Such heterologous protein moieties facilitate purification of
fusion proteins using commercially available affinity matrices. Such moieties include, but are
not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin
(Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, hemagglutinin (HA), and
monoclonal antibody epitopes.

In another embodiment, NSEQ or sequences encoding PSEQ are synthesized, in whole
or in part, using chemical methods well known in the art. (See, e.g., Caruthers et al. (1980)
Nucleic Acids Symp Ser (7) 215-223; Horn et al. (1980) Nucleic Acids Symp Ser (7) 225-232;
and Ausubel, supra.) Alternatively, PSEQ or a polypeptide sequence encoded by NSEQ itself,
or a fragment thereof, may be synthesized using chemical methods. For example, peptide
synthesis can be performed using various solid-phase techniques (Roberge et al. (1995)
Science 269:202-204). Automated synthesis may be achieved using the ABI 431A Peptide
synthesizer (PE Biosystems, Foster City CA). Additionally, PSEQ or the amino acid
sequence encoded by NSEQ, or any part thereof, may be altered during direct synthesis and/or
combined with sequences from other proteins, or any part thereof, to produce a polypeptide
variant.

In another embodiment, the invention provides a substantially purified polypeptide
comprising the amino acid sequence selected from the group consisting of SEQ ID NO:21,
SEQ ID NO:22, SEQ ID NO:23 or fragments thereof.

DIAGNOSTICS and THERAPEUTICS

The sequences of these genes can be used in diagnosis, prognosis, treatment,
prevention, and evaluation of therapies for diseases associated with matrix-remodeling,
particularly cancer, cardiomyopathy, arthritis, angiogenesis, diabetic necrosis, atherosclerosis,
fibrosis, and ulceration. Further, the amino acid sequences encoded by the novel genes are
potential therapeutic proteins and targets of anti-cancer therapeutics or for the treatment of
other diseases associated with matrix remodeling.

In one preferred embodiment, the polynucleotide sequences of NSEQ or the
polynucleotides encoding PSEQ are used for diagnostic purposes to investigate the altered
expression of PSEQ, and to monitor regulation of the levels of mRNA or the polypeptides
encoded by NSEQ during therapeutic intervention. The polynucleotides may be at least 18
nucleotides long, and may be complementary RNA or DNA molecules, branched nucleic
acids, or peptide nucleic acids (PNAs). Alternatively, the polynucleotides are used to detect
and quantitate gene expression in samples in which expression of PSEQ or the polypeptides
encoded by NSEQ is correlated with disease. Additionally, NSEQ or the polynucleotides
encoding PSEQ can be used to detect genetic polymorphisms associated with a disease. These
polymorphisms may be detected at the transcript, cDNA, or genomic level.

The specificity of the probe, whether it is made from a highly specific region, e.g., the 
5' regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency 
of the hybridization or amplification (maximal, high, intermediate, or low), will determine 
whether the probe identifies only naturally occurring sequences encoding PSEQ, allelic
variants, or related sequences. 

Probes may also be used for the detection of related sequences, and should preferably 
have at least 70% sequence identity to any of the NSEQ or PSEQ-encoding sequences. 

Means for producing specific hybridization probes for DNAs encoding PSEQ include
the cloning of NSEQ or polynucleotide sequences encoding PSEQ into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by means of the addition of the
appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes
may be labeled by a variety of reporter groups, for example, by radionuclides such as 32P or
35S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin
coupling systems, by fluorescent labels, and the like. The polynucleotide sequences encoding
PSEQ may be used in Southern or northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; and in microarrays utilizing fluids or tissues from patients
to detect altered NSEQ expression. Such qualitative or quantitative methods are well known
in the art.

NSEQ or the nucleotide sequences encoding PSEQ can be labeled by standard
methods and added to a fluid or tissue sample from a patient under conditions suitable for the
formation of hybridization complexes. After a suitable incubation period, the sample is
washed and the signal is quantitated and compared with a standard value, typically derived
from a non-diseased sample. If the amount of signal in the patient sample is significantly
altered in comparison to the standard value, then the presence of altered levels of nucleotide
sequences of NSEQ and those encoding PSEQ in the sample indicates the presence of the
associated disease. Such assays may also be used to evaluate the efficacy of a particular
therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment
of an individual patient.
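
By way of illustration only, the comparison of a patient signal with a standard value might be sketched as follows. The z-score criterion, the cutoff of 2.0, the example signal values, and all function and variable names are assumptions introduced for this sketch; the text above does not specify how significance is to be assessed.

```python
from statistics import mean, stdev


def is_significantly_altered(patient_signal, reference_signals, z_cutoff=2.0):
    """Return True if the patient signal deviates from the non-diseased
    reference signals by more than z_cutoff standard deviations.

    The z-score rule and the default cutoff are illustrative assumptions.
    """
    ref_mean = mean(reference_signals)
    ref_sd = stdev(reference_signals)  # sample standard deviation
    if ref_sd == 0:
        # All reference values identical: any difference counts as altered.
        return patient_signal != ref_mean
    z = (patient_signal - ref_mean) / ref_sd
    return abs(z) > z_cutoff


# Hypothetical hybridization signals from non-diseased samples.
reference = [100.0, 105.0, 95.0, 98.0, 102.0]
print(is_significantly_altered(130.0, reference))
print(is_significantly_altered(101.0, reference))
```

The same comparison could be repeated on successive samples from one patient to follow the signal during treatment.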

Once the presence of a disease is established and a treatment protocol is initiated,
hybridization or amplification assays can be repeated on a regular basis to determine if the
level of expression in the patient begins to approximate that which is observed in a healthy 
subject. The results obtained from successive assays may be used to show the efficacy of 
treatment over a period ranging from several days to months. 

The polynucleotides may be used for the diagnosis of a variety of diseases associated
with matrix-remodeling, including cancer such as adenocarcinoma, leukemia, lymphoma,
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal
tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary
glands, skin, spleen, testis, thymus, thyroid, and uterus; cardiomyopathy; arthritis;
angiogenesis; diabetic necrosis; atherosclerosis; fibrosis; and ulceration.

Alternatively, the polynucleotides may be used as targets in a microarray. The
microarray can be used to monitor the expression level of large numbers of genes
simultaneously and to identify splice variants, mutations, and polymorphisms. This
information may be used to determine gene function, to understand the genetic basis of a
disease, to diagnose a disease, and to develop and monitor the activities of therapeutic agents.

In yet another alternative, polynucleotides may be used to generate hybridization
probes useful in mapping the naturally occurring genomic sequence. Fluorescent in situ
hybridization (FISH) may be correlated with other physical chromosome mapping techniques
and genetic map data. (See, e.g., Heinz-Ulrich et al. (1995) in Meyers, supra, pp. 965-968.)

In another embodiment, antibodies which specifically bind PSEQ may be used for the
diagnosis of diseases characterized by the over- or underexpression of PSEQ or polypeptides
encoded by NSEQ. A variety of protocols for measuring PSEQ or the polypeptides encoded
by NSEQ, including ELISAs, RIAs, and FACS, are well known in the art and provide a basis
for diagnosing altered or abnormal levels of the expression of PSEQ or the polypeptides
encoded by NSEQ. Standard values for PSEQ expression are established by combining body
fluids or cell extracts taken from healthy subjects, preferably human, with antibody to PSEQ
or a polypeptide encoded by NSEQ under conditions suitable for complex formation. The
amount of complex formation may be quantitated by various methods, preferably by
photometric means. Quantities of PSEQ or the polypeptides encoded by NSEQ expressed in
disease samples from, for example, biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing or
monitoring disease. Alternatively, one may use competitive drug screening assays in which
neutralizing antibodies capable of binding PSEQ or the polypeptides encoded by NSEQ
specifically compete with a test compound for binding the polypeptides. Antibodies can be
used to detect the presence of any peptide which shares one or more antigenic determinants
with PSEQ or the polypeptides encoded by NSEQ.

In another aspect, the polynucleotides and polypeptides of the present invention can be
employed for treatment or the monitoring of therapeutic treatments for cancer. The
polynucleotides of NSEQ or those encoding PSEQ, or any fragment or complement thereof,
may be used for therapeutic purposes. In one aspect, the complement of the polynucleotides
of NSEQ or those encoding PSEQ may be used in situations in which it would be desirable to
block the transcription or translation of the mRNA.

Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia
viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences
to the targeted organ, tissue, or cell population. Methods which are well known to those
skilled in the art can be used to construct vectors to express nucleic acid sequences
complementary to the polynucleotides encoding PSEQ. (See, e.g., Sambrook, supra; and
Ausubel, supra.)

Genes having polynucleotide sequences of NSEQ or those encoding PSEQ can be
turned off by transforming a cell or tissue with expression vectors which express high levels of
a polynucleotide, or fragment thereof, encoding PSEQ. Such constructs may be used to
introduce untranslatable sense or antisense sequences into a cell. Oligonucleotides derived
from the transcription initiation site, e.g., between about positions -10 and +10 from the start
site, are preferred. Similarly, inhibition can be achieved using triple helix base-pairing
methodology. Triple helix pairing is useful because it causes inhibition of the ability of the
double helix to open sufficiently for the binding of polymerases, transcription factors, or
regulatory molecules. Recent therapeutic advances using triplex DNA have been described in
the literature. (See, e.g., Gee et al. (1994) In: Huber and Carr, Molecular and Immunologic
Approaches. Futura Publishing, Mt. Kisco NY, pp. 163-177.) Ribozymes, enzymatic RNA
molecules, may also be used to catalyze the specific cleavage of RNA.

RNA molecules may be modified to increase intracellular stability and half-life.
Possible modifications include, but are not limited to, the addition of flanking sequences at the
5' and/or 3' ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than
phosphodiester linkages within the backbone of the molecule. Alternatively, nontraditional
bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and
similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as
easily recognized by endogenous endonucleases may be included.

Many methods for introducing vectors into cells or tissues are available and equally
suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced
into stem cells taken from the patient and clonally propagated for autologous transplant back
into that same patient. Delivery by transfection, by liposome injections, or by polycationic
amino polymers may be achieved using methods which are well known in the art. (See, e.g.,
Goldman et al. (1997) Nature Biotechnology 15:462-466.)

Further, an antagonist or antibody of a polypeptide of PSEQ or encoded by NSEQ may
be administered to a subject to treat or prevent a cancer associated with increased expression
or activity of PSEQ. An antibody which specifically binds the polypeptide may be used
directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissue which express the polypeptide.

Antibodies to PSEQ or polypeptides encoded by NSEQ may also be generated using
methods that are well known in the art. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments
produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer
formation) are especially preferred for therapeutic use. Monoclonal antibodies to PSEQ may
be prepared using any technique which provides for the production of antibody molecules by
continuous cell lines in culture. These include, but are not limited to, the hybridoma
technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. In
addition, techniques developed for the production of chimeric antibodies can be used. (See,
for example, Meyers, supra.) Alternatively, techniques described for the production of single
chain antibodies may be employed. Antibody fragments which contain specific binding sites
for PSEQ or the polypeptide sequences encoded by NSEQ may also be generated.

Various immunoassays may be used for screening to identify antibodies having the
desired specificity. Numerous protocols for competitive binding or immunoradiometric
assays using either polyclonal or monoclonal antibodies with established specificities are well
known in the art.

Yet further, an agonist of a polypeptide of PSEQ or that encoded by NSEQ may be 
administered to a subject to treat or prevent a cancer associated with decreased expression or 
activity of the polypeptide.

An additional aspect of the invention relates to the administration of a pharmaceutical
or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for any of
the therapeutic effects discussed above. Such pharmaceutical compositions may consist of
polypeptides of PSEQ or those encoded by NSEQ; antibodies to the polypeptides; and
mimetics, agonists, antagonists, or inhibitors of the polypeptides. The compositions may be
administered alone or in combination with at least one other agent, such as a stabilizing
compound, which may be administered in any sterile, biocompatible pharmaceutical carrier
including, but not limited to, saline, buffered saline, dextrose, and water. The compositions
may be administered to a patient alone, or in combination with other agents, drugs, or
hormones.

The pharmaceutical compositions utilized in this invention may be administered by
any number of routes including, but not limited to, oral, intravenous, intramuscular,
intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, 
intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions may contain
suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which
facilitate processing of the active compounds into preparations which can be used
pharmaceutically. Further details on techniques for formulation and administration may be
found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton
PA).

For any compound, the therapeutically effective dose can be estimated initially either
in cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits,
dogs, or pigs. An animal model may also be used to determine the appropriate concentration 
range and route of administration. Such information can then be used to determine useful 
doses and routes for administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example,
polypeptides of PSEQ or those encoded by NSEQ, or fragments thereof, antibodies of the
polypeptides, and agonists, antagonists or inhibitors of the polypeptides, which ameliorates the
symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard
pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the
dose lethal to 50% of the population) statistics.
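
As a minimal sketch of how such a median dose might be estimated, the following interpolates on a log-dose scale between the two tested doses that bracket a 50% response. The interpolation method, the function name, and the example dose-response values are assumptions made for illustration; standard pharmaceutical procedures may instead fit a probit or logistic model.

```python
import math


def estimate_median_dose(doses, fraction_responding):
    """Estimate the dose at which 50% of the population responds by
    log-linear interpolation between the bracketing tested doses.

    Usable for either an ED50 (fraction showing the therapeutic effect)
    or an LD50 (fraction dying), depending on the response supplied.
    """
    points = sorted(zip(doses, fraction_responding))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_dose = math.log10(d_lo) + t * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    raise ValueError("tested doses do not bracket a 50% response")


# Hypothetical data: dose (mg/kg) and fraction of animals responding.
print(estimate_median_dose([1.0, 100.0], [0.25, 0.75]))
```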


Any of the therapeutic methods described above may be applied to any subject in need 
of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, 
monkeys, and most preferably, humans. 

EXAMPLES

It is understood that this invention is not limited to the particular methodology,
protocols, and reagents described, as these may vary. It is also understood that the
terminology used herein is for the purpose of describing particular embodiments only, and is
not intended to limit the scope of the present invention which will be limited only by the
appended claims. The examples below are provided to illustrate the subject invention and are
not included for the purpose of limiting the invention.

I cDNA Library Construction

The cDNA library, THYMFET02, was selected to demonstrate the construction of the 
cDNA libraries from which novel matrix remodeling genes were derived. The THYMFET02 
cDNA library was constructed from microscopically normal thymus tissue obtained from a 
Caucasian female fetus who died at 17 weeks gestation from anencephaly. Serology was 
negative; family history included tobacco abuse and gastritis. 

The frozen tissue was homogenized and lysed in TRIZOL reagent (1 g tissue/10 ml
TRIZOL; Life Technologies, Rockville MD), a monophasic solution of phenol and guanidine
isothiocyanate, using a POLYTRON homogenizer (PT-3000; Brinkmann Instruments,
Westbury NY). After a brief incubation on ice, chloroform was added (1:5 v/v), and the lysate
was centrifuged. The upper chloroform layer was removed, and the RNA was precipitated 
with isopropanol, resuspended in DEPC-treated water, and treated with DNase for 25 min at 
37 °C. The mRNA was reextracted once with acid phenol-chloroform pH 4.7 and precipitated 
using 0.3 M sodium acetate and 2.5 volumes ethanol. The mRNA was isolated using the 
OLIGOTEX kit (Qiagen, Chatsworth CA) and used to construct the cDNA library. 

The mRNA was handled according to the recommended protocols in the 
SUPERSCRIPT Plasmid system (Life Technologies). The cDNAs were fractionated on a 
SEPHAROSE CL4B column (Amersham Pharmacia Biotech, Piscataway NJ), and those
cDNAs exceeding 400 bp were ligated into pINCY1 plasmid (Incyte Pharmaceuticals, Palo
Alto CA). The plasmid was subsequently transformed into DH5α competent cells (Life
Technologies). 

II Isolation and Sequencing of cDNA Clones 

Plasmid DNA was released from the cells and purified using the REAL Prep 96 
Plasmid kit (Qiagen). This kit enabled the simultaneous purification of 96 samples in a
96-well block using multi-channel reagent dispensers. The recommended protocol was employed
except for the following changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth
(Life Technologies) with carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation,
the cultures were incubated for 19 hours and at the end of incubation, the cells were lysed with
0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was
resuspended in 0.1 ml of distilled water. After the last step in the protocol, samples were
transferred to a 96-well block for storage at 4 °C.

The cDNAs were prepared using a MICROLAB 2200 (Hamilton, Reno NV) in 
combination with DNA ENGINE thermal cyclers (PTC200; MJ Research, Watertown MA) 
and sequenced by the method of Sanger et al. (1975, J Mol Biol 94:441f) using ABI PRISM
377 DNA sequencing systems. 

III Selection, Assembly, and Characterization of Sequences

The sequences used for coexpression analysis were assembled from EST sequences, 5' 
and 3' longread sequences, and full length coding sequences. Selected assembled sequences
were expressed in at least three cDNA libraries. 

The assembly process is described as follows. EST sequence chromatograms were 
processed and verified. Quality scores were obtained using PHRED (Ewing et al. (1998) 
Genome Res 8:175-185; Ewing and Green (1998) Genome Res 8:186-194). Then the edited 
sequences were loaded into a relational database management system (RDBMS). The EST 
sequences were clustered into an initial set of bins using BLAST with a product score of 50. 
All clusters of two or more sequences were created as bins. The overlapping sequences 
represented in a bin correspond to the sequence of a transcribed gene. 

Assembly of the component sequences within each bin was performed using a
modification of PHRAP, a publicly available program for assembling DNA fragments (Phil 
Green, University of Washington, Seattle WA). Bins that showed 82% identity from a local 
pair-wise alignment between any of the consensus sequences were merged. 

Bins were annotated by screening the consensus sequence in each bin against public 
databases, such as GBpri and GenPept from NCBI. The annotation process involved a FASTn 
screen against the GBpri database in GenBank. Those hits with a percent identity of greater 
than or equal to 70% and an alignment length of greater than or equal to 100 base pairs were 
recorded as homolog hits. The residual unannotated sequences were screened by FASTx 
against GenPept. Those hits with an E value of less than or equal to 10⁻⁸ were recorded as
homolog hits.
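
The two annotation thresholds can be expressed as a simple filter. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical and do not correspond to the output format of FASTn or FASTx; only the numeric thresholds are taken from the text above.

```python
def is_homolog_hit(hit):
    """Apply the annotation thresholds to one search hit.

    `hit` is a dict with a hypothetical layout:
      program           -- "FASTn" (vs. GBpri) or "FASTx" (vs. GenPept)
      percent_identity  -- percent identity of the alignment (FASTn)
      alignment_length  -- alignment length in base pairs (FASTn)
      e_value           -- expectation value of the hit (FASTx)
    """
    if hit["program"] == "FASTn":
        return hit["percent_identity"] >= 70.0 and hit["alignment_length"] >= 100
    if hit["program"] == "FASTx":
        return hit["e_value"] <= 1e-8
    return False


# Hypothetical hits for illustration.
hits = [
    {"program": "FASTn", "percent_identity": 85.0, "alignment_length": 240},
    {"program": "FASTn", "percent_identity": 92.0, "alignment_length": 60},
    {"program": "FASTx", "e_value": 3e-12},
    {"program": "FASTx", "e_value": 1e-4},
]
print([is_homolog_hit(h) for h in hits])
```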

Sequences were then reclustered using BLASTn and Cross-Match, a program for rapid
protein and nucleic acid sequence comparison and database search (Green, supra),
sequentially. Any BLAST alignment between a sequence and a consensus sequence with a
score greater than 150 was realigned using Cross-Match. The sequence was added to the bin
whose consensus sequence gave the highest Smith-Waterman score amongst local alignments
with at least 82% identity. Non-matching sequences created new bins. The assembly and
consensus generation processes were performed for the new bins.
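
The bin-assignment rule used during reclustering can be sketched as follows. This is an illustration under stated assumptions: the alignments are taken as already computed, and the tuple layout, bin names, and function name are hypothetical; only the 82% identity threshold and the highest-score rule come from the text above.

```python
def assign_to_bin(alignments, min_identity=82.0):
    """Choose the bin for one sequence.

    `alignments` is a list of (bin_id, smith_waterman_score, percent_identity)
    tuples, one per local alignment against a bin consensus sequence.
    Returns the bin with the highest Smith-Waterman score among alignments
    with at least `min_identity` percent identity, or None if no alignment
    qualifies (in which case the sequence would start a new bin).
    """
    eligible = [a for a in alignments if a[2] >= min_identity]
    if not eligible:
        return None
    return max(eligible, key=lambda a: a[1])[0]


# Hypothetical alignments: bin1 scores highest but falls below 82% identity.
candidates = [("bin1", 300, 80.0), ("bin2", 250, 90.0), ("bin3", 280, 85.0)]
print(assign_to_bin(candidates))
```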

IV Coexpression Analyses of Known Matrix-remodeling Genes

Twenty-one known matrix-remodeling genes were selected to identify novel genes that
are closely associated with matrix remodeling. The known genes were osteonectin (BM-40),
chondroitin/dermatan sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV (coll-I, coll-II,
and coll-III), connective tissue growth factor (CTGF), fibrillin, fibronectins, fibronectin
receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG), extracellular matrix protein
(hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth factor binding protein
(IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix metalloproteases (MMP), and
tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3). The protein
products of the known matrix-remodeling genes may be categorized as follows.

1. Extracellular matrix component protein. These proteins include collagens,
proteoglycans, fibrillin, fibronectin, fibulin, and laminin that constitute the major structures of 
the extracellular matrix. 

2. Matrix proteases and matrix protease inhibitors. These proteins include matrix 
metalloproteases (MMPs) such as the collagenases, and MMP inhibitors such as the tissue- 
inhibitors of matrix metalloproteases (TIMPs). 

3. Regulatory proteins that control expression of matrix-remodeling genes. Such 
regulatory proteins include connective tissue growth factor, insulin-like growth factor, 
osteonectin (BM-40), and the receptors for and inhibitors of these proteins. 

The known matrix-remodeling genes that we examined in this analysis, and brief 
descriptions of their functions, are listed in Table 4. Detailed descriptions of their roles in 
matrix remodeling may be found in the cited articles and reviews.

Table 4. Known Matrix-remodeling Genes.

Gene          Description & References

BM-40         Alternate names: SPARC, osteonectin
              Regulates connective tissue remodeling, wound healing, angiogenesis
              Induces matrix metalloprotease synthesis (collagenase & gelatinase)
              Regulates cell movement and proliferation
              Expression increased in neoplastic melanoma, fibrosis, angiogenesis.
              (Kamihagi et al. (1994) Biochem Biophys Res Commun 200:423-8; Lane et
              al. (1994) J Cell Biol 125:929-43; Inagaki et al. (1996) Life Sci 58:927-34;
              Ledda et al. (1997) J Invest Dermatol 108:210-4; Shankavaram et al. (1997) J
              Cell Physiol 173:327-34.)

C/DSPG        Chondroitin/dermatan sulfate proteoglycans
              Major extracellular matrix proteoglycan
              Regulate cell proliferation, attachment and migration.
              (Darnell et al. (1990) Molecular Cell Biology. Scientific American Press, New
              York NY; Toole (1991) In: Cell Biology of Extracellular Matrix. Plenum,
              New York NY, pp. 305-341; Beck et al. (1993) Biochem Biophys Res
              Commun 190:616-23)

Collagens     Family of fibrous structural proteins (collagen I, II, III, IV, etc.)
              Most abundant structural component of the extracellular matrix
              Secreted as procollagen; converted to collagen by MMPs
              (Alexander and Werb (1991) In: Cell Biology of Extracellular Matrix, pp.
              255-302; Adams (1993) In: Extracellular Matrix. Marcel Dekker, New York
              NY, pp. 91-119; Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-254.)

CTGF          Connective tissue growth factor
              Mediates induction of matrix synthesis and fibrosis
              (Grotendorst (1997) Cytokine Growth Factor Rev 8:171-9; Oemar and
              Luscher (1997) Arterioscler Thromb Vasc Biol 17:1483-9; Ito et al. (1998)
              Kidney Int 53:853-61.)

fibrillin     Major component of extracellular microfibrils (matrix elastic network)
              Present in connective tissue throughout the body
              (Kielty and Shuttleworth (1995) Int J Biochem Cell Biol 27:747-60; Haynes
              et al. (1997) Br J Dermatol 137:17-23; Hayward and Brock (1997) Hum
              Mutat 10:415-23.)

fibronectins  Family of extracellular matrix glycoproteins
              Anchor cells to the matrix
              Bind matrix proteins to cell surface receptors

fibr-r        Fibronectin receptor
              Fibronectin receptors regulate cell adhesion & migration
              (Darnell et al. (1990) Molecular Cell Biology. Scientific American Press,
              New York NY; Ruoslahti (1991) Cell Biology of Extracellular Matrix, pp.
              343-363; Yamada (1991) Cell Biology of Extracellular Matrix, pp. 111-146.)

fibulin 1     Fibronectin-binding extracellular matrix protein
              Mediates platelet adhesion via a bridge of fibrinogen
              Cleaved by matrix metalloproteinases
              Inhibits breast and ovarian cancer cell motility
              (Argraves et al. (1990) J Cell Biol 111:3155-64; Sasaki et al. (1996) Eur J
              Biochem 240:427-34; Hayashido et al. (1998) Int J Cancer 75:654-8.)

HSPG          Heparan sulfate proteoglycans
              Extracellular matrix proteoglycan found on cell surface of many cell types
              Regulate cell interactions with the extracellular matrix
              Bind to collagens and fibronectin in the matrix
              Regulate cell proliferation, attachment and migration
              (Darnell et al. (1990); Toole (1991) In: Cell Biology of Extracellular Matrix,
              pp. 305-341; Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-254.)

hevin         Extracellular matrix protein
              Homolog to BM-40
              Regulates cell adhesion and migration
              Downregulated in metastatic prostate cancer, lung cancer
              (Girard and Springer (1996) J Biol Chem 271:4511-7; Bendik et al. Cancer
              Res 58:232-6.)

IGF1          Insulin-like growth factor
              Regulates matrix homeostasis and remodeling
              Regulates aggregation, growth and survival of cancer cells
              (Aston et al. (1995) Am J Respir Crit Care Med 151:1597-603; Bitar and
              Labbad (1996) J Surg Res 61:113-9; Guvakova and Surmacz (1997) Exp Cell
              Res 231:149-62; Sunic et al. (1998) Endocrinology 139:2356-62.)

IGFBP         Insulin-like growth factor binding protein
              Regulates IGF-1 bioavailability (binds IGF-1 more strongly than the receptor)
              Degraded by matrix metalloproteases
              (Kiefer et al. (1991) Biochem Biophys Res Commun 176:219-25; Fowlkes et
              al. (1995) Prog Growth Factor Res 6:255-63; Parker et al. (1996) J Biol Chem
              273:13523-9.)

laminin       Major protein in basal lamina, with collagen, HSPG, and entactin
              Anchors cells to the matrix by binding collagen, HSPG and heparin
              Laminins and collagens are the main targets of MMPs
              Regulates cell attachment, migration, growth, and differentiation
              (Yamada et al. (1993) In: Extracellular Matrix, pp. 49-66; Giannelli et al.
              (1997) Science 277:225-8; Quaranta and Plopper (1997) Kidney Int 51:1441-6;
              Soini et al. (1997) Hum Pathol 28:220-6.)

lumican       Extracellular proteoglycan
              Organizes collagen fibrils in extracellular matrix
              (Dourado et al. (1996) Osteoarthritis Cartilage 4:187-96; Scott (1996)
              Biochemistry 35:8795-9; Cs-Szabo et al. (1997) Arthritis Rheum 40:1037-45.)

MGP           Matrix Gla protein
              Regulates calcification of cartilage
              Marker for osteoblast activity
              (Shanahan et al. (1994) J Clin Invest 93:2393-402; Luo et al. (1997) Nature
              386:78-81; Martinetti et al. (1997) Tumour Biol 18:197-205)

MMP           Family of matrix metalloproteases (including collagenases)
              Cleave procollagen to produce collagen
              (Alexander and Werb (1991) In: Cell Biology of Extracellular Matrix, pp.
              255-302; Adams (1993) In: Extracellular Matrix, pp. 91-119; Schuppan et al.
              (1993) In: Extracellular Matrix, pp. 201-254.)

TIMP 1,2,3    Tissue inhibitors of matrix metalloproteinases
              Bind and inactivate matrix proteases
              (Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-254; Zvibel and
              Kraft (1993) In: Extracellular Matrix, pp. 559-580.)



The coexpression of the 21 known genes with each other is shown in Table 5. Each 
entry in Table 5 is the negative log of the p-value (-log p) for the coexpression of the two 
genes. As shown, the method identified the strong association of the known genes 
among themselves, indicating that the coexpression analysis method of the present invention 
was effective in identifying genes that are closely associated with matrix remodeling. 
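
As an illustrative sketch only, the conversion from a coexpression p-value to a table entry can be written as follows. The base of the logarithm is not stated here, so base 10 is assumed, and the gene pairs and p-values are hypothetical placeholders rather than values taken from Table 5.

```python
import math


def neg_log_p(p_value):
    """Return -log10(p) for a p-value in (0, 1]; base 10 is an assumption."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


# Hypothetical p-values for pairs of known genes (not taken from Table 5).
example_p_values = {
    ("laminin", "lumican"): 1e-9,
    ("MGP", "MMP"): 2.5e-6,
}

# One table entry per gene pair, rounded to one decimal place.
table_entries = {
    pair: round(neg_log_p(p), 1) for pair, p in example_p_values.items()
}
```

A larger entry therefore corresponds to a smaller p-value, that is, to stronger evidence of coexpression.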

Table 5. Coexpression of 21 known matrix-remodeling genes. (-log p) 

V Novel Genes Associated with Matrix Remodeling 

Using coexpression analysis, we have identified 20 novel genes that show strong 
association with known matrix-remodeling genes from a total of 41,419 assembled gene 
sequences. The degree of association was measured by probability values, with a cutoff of 
p less than 0.00001. This was followed by annotation and literature searches to ensure 
that the genes that passed the probability test have strong association with known 
matrix-remodeling genes. This process was reiterated so that the initial 41,419 genes were 
reduced to the final 20 matrix-remodeling genes. Details of the coexpression patterns for the 
20 novel matrix-remodeling genes are presented in Table 6. 

Each of the 20 novel genes is coexpressed with at least two of the 21 known genes 
with a p-value of less than 10⁻⁷. The novel genes identified are listed in Table 6 by their 
Incyte clone numbers (Clone), and the known genes by their abbreviated names (Gene) as 
shown in Example IV. 
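
The selection rule described above (coexpression with at least two known genes at a p-value below 10^-7) can be sketched as below. This is a minimal illustration, not the program actually used; the function names and the candidate p-values are hypothetical.

```python
def coexpressed_partners(p_by_known_gene, threshold=1e-7):
    """Known genes whose coexpression p-value falls below the threshold."""
    return sorted(g for g, p in p_by_known_gene.items() if p < threshold)


def passes_selection(p_by_known_gene, threshold=1e-7, min_partners=2):
    """True if the candidate is coexpressed with at least min_partners known genes."""
    return len(coexpressed_partners(p_by_known_gene, threshold)) >= min_partners


# Hypothetical candidates: p-values against three known genes.
candidate_a = {"laminin": 1e-9, "lumican": 3e-8, "MGP": 0.02}
candidate_b = {"laminin": 1e-9, "lumican": 1e-5, "MGP": 0.02}
```

Here `candidate_a` would be retained (two partners below the threshold) and `candidate_b` would not (only one).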

Table 6. Coexpression of 20 novel genes with known matrix-remodeling genes. (-log p) 

VI Novel Genes Associated with Matrix Remodeling 

The 20 novel genes were identified from the data shown in Table 6 to be associated with 
matrix remodeling. 

The nucleotide sequences comprising the consensus sequences of SEQ ID NOs:1-20 of the 
present invention were first identified from Incyte Clones 606132, 627722, 639644, 1362659, 
1446685, 1556751, 1656953, 1662318, 1996726, 2137155, 2268890, 2305981, 2457612, 2814981, 
3089150, 3206667, 3284695, 3481610, 3722004, and 3948614, respectively, and assembled according 
to Example III. BLAST and other motif searches were performed for SEQ ID NOs:1-20 according to 
Example VII. The sequences of SEQ ID NOs:1-20 were translated and sequence identity was sought 
with known sequences. Polypeptide sequences comprising the consensus sequences of SEQ ID 
NO:21, SEQ ID NO:22, and SEQ ID NO:23 of the present invention were encoded by SEQ ID NO:2, 
SEQ ID NO:6, and SEQ ID NO:11, respectively. SEQ ID NOs:21-23 were analyzed using BLAST 
and other motif search tools as disclosed in Example VII. 

SEQ ID NO:3 is 2987 nucleotides in length and shows about 59% sequence identity from about 
nucleotide 2117 to about nucleotide 2914 with the cDNA encoding the regulatory subunit of a human 
cAMP-dependent protein kinase, RIIbeta (WO 88/03164). SEQ ID NO:8 is 3017 nucleotides in length 
and shows about 70% to about 74% sequence identity from about nucleotide 1 to about nucleotide 
1260 and about nucleotide 1925 to about nucleotide 1985 with human Hpast mRNA (g2529706), a gene 
associated with multiple endocrine neoplasia type 1. SEQ ID NO:9 is 1735 nucleotides in length and 
shows about 25% sequence identity from about nucleotide 5 to about nucleotide 1534 with a human 
neuronal cell adhesion molecule (WO 96/04396) important in the development of the nervous system by 
promoting cell-cell adhesion. SEQ ID NO:14 is 2040 nucleotides in length and shows about 60% to 
70% sequence identity from about nucleotide 1 to about nucleotide 1023 with a human mRNA for a 
serine protease (g1621243) specific for insulin-like growth factor-binding proteins. The amino acid 
sequence encoded by SEQ ID NO:14 from about nucleotide 3 to about nucleotide 1043 shows about 
61% sequence identity with an osteoblast-like cell-derived protein (J09107980) useful for treatment 
and prevention of various diseases and as a contraceptive. SEQ ID NO:15 is 2121 nucleotides in length 
and shows 60-80% sequence identity with a mouse gene, ADAMT-1 (g2809056), a member of the 
ADAM (a disintegrin and metalloproteinase) family. ADAMT-1 has been shown to contain the 
thrombospondin (TSP) type I motif; expression of ADAMT-1 is closely associated with inflammatory 
processes (Kuno et al. (1997) Genomics 46:466-471). SEQ ID NO:16 is 2900 nucleotides in length 
and shows about 70% sequence identity with a mouse homeobox (Pmx) mRNA (g460124). 
Homeobox genes are expressed in very specific temporal and spatial patterns and function as 
transcriptional regulators of developmental processes (Kern et al. (1994) Genomics 19:334-340). 
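
The identity percentages quoted above are reported over an aligned region. The text does not state the exact formula used, so the following is only a sketch of one common definition: the fraction of aligned, non-gap columns in which the two residues match. The function name and the short example alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the non-gap columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 4 matches in 5 compared columns, one gap column skipped.
example_identity = percent_identity("ACGT-A", "ACCTTA")
```

Other conventions (for example, counting gap columns in the denominator) give different values, which is one reason the figures above are stated as approximate.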

SEQ ID NO:21 is 551 amino acid residues long and shows about 37% sequence identity from 
about amino acid residue 10 to about amino acid residue 278 with PALM (g3219602), a human 
paralemmin that is membrane-bound and expressed abundantly in brain and at intermediate levels in the 
kidney and in endocrine cells. In addition, the sequence encompassing residues 418 to 434 of SEQ ID 
NO:21 resembles one of the structural fingerprint regions of a seven transmembrane receptor, LCR1, 
that is isolated from the human brain (Rimland et al. (1991) Mol Pharmacol 40:869-875). SEQ ID 
NO:21 also has one potential amidation site at L546; three potential N-glycosylation sites at N223, 
N229, and N408; one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at 
S486; fifteen potential casein kinase II phosphorylation sites at S57, S100, T101, T116, S135, S253, 
T349, S370, T387, S426, T434, S489, S505, S520, and T526; one potential N-myristoylation site at 
G54; and nine potential protein kinase C phosphorylation sites at T15, S25, S57, S100, S123, S247, 
S364, S370, and S505. 

SEQ ID NO:22 is 99 amino acid residues in length. The sequence of SEQ ID 
NO:22 from about amino acid residue 71 to about amino acid residue 81 resembles one of the 
fingerprint regions of the RH1 and RH2 opsins, a family of G protein coupled receptors that mediate 
vision (Zuker et al. (1985) Cell 40:851-858; Cowman et al. (1986) Cell 44:705-710). SEQ ID NO:22 
also has one potential N-myristoylation site at G24, and two potential protein kinase C 
phosphorylation sites at S13 and S89. 

SEQ ID NO:23 is 493 amino acid residues in length and shows 
about 44% sequence identity from about amino acid residue 277 to about amino acid residue 487 with 
an angiopoietin-like factor from the human cornea, CDT6 (g2765527). Angiopoietin 1 and 
angiopoietin 2 function as a natural ligand and a natural inhibitor, respectively, for TIE2, a receptor 
critical in angiogenesis during embryonic development, tumor growth, and tumor metastasis. The 
sequences encompassing amino acid residues 305 to 343, 346 to 355, 365 to 402, 411 to 424, and 428 
to 458 of SEQ ID NO:23 resemble the carboxy-terminal domain signatures of fibrinogen beta and 
gamma chains from BLOCKS analysis. SEQ ID NO:23 also exhibits one potential signal peptide 
region encompassing amino acid residues M1 to G22 when analyzed using an HMM-based signal 
peptide analysis tool. In addition, SEQ ID NO:23 shows two potential N-glycosylation sites at N164 
and N192; one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at S127; 
six potential casein kinase II phosphorylation sites at S34, S209, T238, S266, T368, and T417; four 
potential N-myristoylation sites at G12, G18, G22, and G29; eight potential protein kinase C 
phosphorylation sites at S34, S209, T268, T299, T335, S373, S383, and S477; and three potential 
tyrosine kinase phosphorylation sites at Y183, Y392, and Y467. 

VII Homology Searching for Matrix-Remodeling Genes and the Proteins Encoded by the 
Genes 

Polynucleotide sequences, SEQ ID NOs:1-20, and polypeptide sequences, SEQ ID NOs:21-23, 
were queried against databases derived from sources such as GenBank and SwissProt. These 
databases, which contain previously identified and annotated sequences, were searched for regions of 
similarity using Basic Local Alignment Search Tool (BLAST; Altschul (1990) supra) and 
Smith-Waterman alignment (Smith et al. (1992) Protein Engineering 5:35-51). BLAST searched for matches 
and reported only those that satisfied the probability thresholds of 10⁻²⁵ or less for nucleotide 
sequences and 10⁻⁸ or less for polypeptide sequences. 
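
The reporting thresholds above amount to a simple filter on search hits. The sketch below assumes each hit is a small record with a probability field; the record layout, the field names, and the example hits are hypothetical and do not reflect the output format of any particular BLAST release.

```python
# Probability cutoffs by query type, as stated in the text above.
THRESHOLDS = {"nucleotide": 1e-25, "polypeptide": 1e-8}


def reportable(hits, sequence_type):
    """Keep hits whose probability is at or below the cutoff for the query type."""
    cutoff = THRESHOLDS[sequence_type]
    return [h for h in hits if h["probability"] <= cutoff]


# Hypothetical hits for illustration only.
example_hits = [
    {"subject": "hypothetical_hit_1", "probability": 1e-40},
    {"subject": "hypothetical_hit_2", "probability": 1e-12},
]

nucleotide_report = reportable(example_hits, "nucleotide")
polypeptide_report = reportable(example_hits, "polypeptide")
```

The same hit list yields one reportable match under the nucleotide cutoff and two under the more permissive polypeptide cutoff.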

The polypeptide sequences were also analyzed for known motif patterns using MOTIFS, 
SPSCAN, BLIMPS, and Hidden Markov Model (HMM)-based protocols. MOTIFS (Genetics 
Computer Group, Madison WI) searches polypeptide sequences for patterns that match those defined 
in the Prosite Dictionary of Protein Sites and Patterns (Bairoch et al. supra), and displays the patterns 
found and their corresponding literature abstracts. SPSCAN (Genetics Computer Group) searches for 
potential signal peptide sequences using a weighted matrix method (Nielsen et al. (1997) Prot Eng 
10:1-6). Hits with a score of 5 or greater were considered. BLIMPS uses a weighted matrix analysis 
algorithm to search for sequence similarity between the polypeptide sequences and those contained in 
BLOCKS, a database consisting of short amino acid segments, or blocks, of 3-60 amino acids in 
length, compiled from the PROSITE database (Henikoff et al. supra; Bairoch et al. supra), and those in 
PRINTS, a protein fingerprint database based on non-redundant sequences obtained from sources such 
as SwissProt, GenBank, PIR, and NRL-3D (Attwood et al. (1997) J Chem Inf Comput Sci 
37:417-424). For the purposes of the present invention, the BLIMPS searches reported matches with a 
cutoff score of 1000 or greater and a cutoff probability value of 1.0 x 10 [exponent illegible]. HMM-based 
protocols were based on a probabilistic approach and searched for consensus primary structures of gene 
families in the protein sequences (Eddy, supra; Sonnhammer, supra). More than 500 known protein families 
with cutoff scores ranging from 10 to 50 bits were selected for use in this invention. 
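
Pattern-based motif searching of the kind described for MOTIFS can be illustrated with regular expressions. The sketch below is not the MOTIFS program; it hand-translates three widely cited PROSITE-style site patterns (N-glycosylation N-{P}-[ST]-{P}, protein kinase C [ST]-x-[RK], casein kinase II [ST]-x(2)-[DE]) into Python regular expressions, and the function name and test peptide are hypothetical.

```python
import re

# Hand-translated PROSITE-style patterns (assumed for illustration).
SITE_PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "protein kinase C phosphorylation": r"[ST].[RK]",
    "casein kinase II phosphorylation": r"[ST].{2}[DE]",
}


def find_sites(sequence, pattern):
    """Return 1-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=" + pattern + ")")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


# Hypothetical peptide, not one of SEQ ID NOs:21-23.
peptide = "MKNATSAPNPSGSLKTEDE"
sites = {name: find_sites(peptide, pat) for name, pat in SITE_PATTERNS.items()}
```

Positions are reported in the same residue-number style used in Example VI (for instance, a match starting at the third residue is reported as position 3). Such pattern matches indicate only potential sites, consistent with the wording used above.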

VIII Labeling and Use of Individual Hybridization Probes 

Oligonucleotides are designed using software such as OLIGO 4.06 
(National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of [γ-³²P] 
adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (NEN Life 
Science Products, Boston MA). The labeled oligonucleotides are substantially purified using a 
SEPHADEX G-25 superfine resin column (Amersham Pharmacia Biotech). An aliquot containing 10⁷ 
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of 
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, 
Xba I, or Pvu II (NEN Life Science Products). 

The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to 
nylon membranes (NYTRAN PLUS, Schleicher & Schuell, Durham NH). Hybridization is carried out 
under the following conditions: 5x SSC/0.1% SDS at 60° C for about 6 hours. Subsequent washes are 
performed at higher stringency with buffers such as 1x SSC/0.1% SDS at 45° C, then 0.1x SSC. After 
XOMAT AR film (Eastman Kodak, Rochester NY) is exposed to the blots for several hours, 
hybridization patterns are compared. 

IX Production of Specific Antibodies 

SEQ ID NO:20, 21, or 23, substantially purified using polyacrylamide gel electrophoresis 
(Harrington (1990) Methods Enzymol 182:488-495) or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the amino acid sequence is analyzed using LASERGENE software (DNASTAR, 
Madison WI) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well 
described in the art. Typically, oligopeptides [number illegible] residues in length are synthesized using 
an ABI 431A peptide synthesizer (PE Biosystems) using Fmoc chemistry and coupled to KLH 
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester to 
increase immunogenicity. Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's 
adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to 
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with 
radio-iodinated goat anti-rabbit IgG. 

What is claimed is: 

1. A substantially purified polynucleotide comprising a gene that is coexpressed with one or 
more known matrix-remodeling genes in a plurality of biological samples, wherein each known 
matrix-remodeling gene is selected from the group consisting of osteonectin, chondroitin/dermatan 
sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, fibrillin, fibronectins, 
fibronectin receptor, fibulin 1, heparan sulfate proteoglycan, extracellular matrix protein, insulin-like 
growth factor 1, insulin-like growth factor binding protein, laminin, lumican, matrix Gla protein, 
matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 1, 2, and 3. 

2. The polynucleotide of claim 1, comprising a polynucleotide sequence selected from the 
group consisting of: 

(a) a polynucleotide sequence selected from the group consisting of SEQ ID NOs:1-20; 

(b) a polynucleotide sequence which encodes the polypeptide sequence of SEQ ID NO:21, 
22, or 23; 

(c) a polynucleotide sequence having at least 70% identity to the polynucleotide sequence of 
(a) or (b); 

(d) a polynucleotide sequence comprising at least 18 sequential nucleotides of the 
polynucleotide sequence of (a), (b), or (c); 

(e) a polynucleotide which hybridizes under stringent conditions to the polynucleotide of (a), 
(b), (c), or (d); and 

(f) a polynucleotide sequence which is complementary to the polynucleotide sequence of (a), 
(b), (c), (d), or (e). 

3. A substantially purified polypeptide comprising the gene product of a gene that is 
coexpressed with one or more known matrix-remodeling genes in a plurality of biological samples, 
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin, 
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, 
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular 
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin, 
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 
1, 2, and 3. 

4. The polypeptide of claim 3, comprising a polypeptide sequence selected from the group 
consisting of: 

(a) the polypeptide sequence of SEQ ID NO:21, 22, or 23; 

(b) a polypeptide sequence having at least 85% identity to the polypeptide sequence of (a); 
and 

(c) a polypeptide sequence comprising at least 6 sequential amino acids of the polypeptide 
sequence of (a) or (b). 

5. An expression vector comprising the polynucleotide of claim 2. 

6. A host cell comprising the expression vector of claim 5. 

7. A pharmaceutical composition comprising the polynucleotide of claim 2 or the polypeptide 
of claim 3 in conjunction with a suitable pharmaceutical carrier. 

8. An antibody which specifically binds to the polypeptide of claim 4. 

9. A method for diagnosing a disease or condition associated with the altered expression of a 
gene that is coexpressed with one or more known matrix-remodeling genes, wherein each known 
matrix-remodeling gene is selected from the group consisting of osteonectin, chondroitin/dermatan 
sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, fibrillin, fibronectins, 
fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular matrix protein, insulin-like 
growth factor 1, insulin-like growth factor binding protein, laminin, lumican, matrix Gla protein, 
matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 1, 2, and 3, the method 
comprising the steps of: 

(a) providing a sample comprising one or more of said coexpressed genes; 

(b) hybridizing the polynucleotide of claim 2(f) to said coexpressed genes under conditions 
effective to form one or more hybridization complexes; and 

(c) detecting the hybridization complexes, wherein the altered level of hybridization 
complexes compared with the level of hybridization complexes of a nondiseased sample 
correlates with the presence of the disease or condition. 

10. A method for treating or preventing a disease associated with the altered expression of a 
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need, 
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin, 
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, 
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular 
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin, 
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 
1, 2, and 3, the method comprising the step of administering to said subject in need the pharmaceutical 
composition of claim 7 in an amount effective for treating or preventing said disease. 

11. A method for treating or preventing a disease associated with the altered expression of a 
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need, 
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin, 
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, 
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular 
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin, 
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 
1, 2, and 3, the method comprising the step of administering to said subject in need the antibody of 
claim 8 in an amount effective for treating or preventing said disease. 

12. A method for treating or preventing a disease associated with the altered expression of a 
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need, 
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin, 
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, 
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular 
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin, 
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 
1, 2, and 3, the method comprising the step of administering to said subject in need the polynucleotide 
sequence of claim 2(f) in an amount effective for treating or preventing said disease. 

<110> INCYTE PHARMACEUTICALS, INC. 
WALKER, Michael G. 
VOLKMUTH, Wayne 
KLINGLER, Tod M. 

<120> MATRIX-REMODELING GENES 

<130> PB-0004 PCT 

<140> To Be Assigned 
<141> Herewith 

<150> 09/169,289 
<151> 1998-10-09 

<160> 23 

<170> PERL Program 



<210> 1 

<211> 1447 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> unsure 
<222> 1380 

<223> a or g or c or t, unknown, or other 
<220> 

<221> misc_feature 

<223> Incyte ID No.: 606132CB1 

<400> 1 

cctggaacca gaaggagacc tacctgcaca tcatgaagaa cgaggaggag gtggtgatct 60 
tgttcgcgca ggtgggcgac cgcagcatca tgcaaagcca gagcctgatg ctggagctgc 120 

gagagcagga ccaggtgtgg gtacgcctct acaagggcga acgtgagaac gccatcttca 180 

gcgaggagct ggacacctac atcaccttca gtggctacct ggtcaagcac gccaccgagc 240 
cctagctggc cggccacctc ctttcctctc gccaccttcc acccctgcgc tgtgctgacc 300 
ccaccgcctc ttccccgatc cctggactcc gactccctgg ctttggcatt cagtgagacg 360 

ccctgcacac acagaaagcc aaagcgatcg gtgctcccag atcccgcagc ctctggagag 420 

agctgacggc agatgaaatc accagggcgg ggcacccgcg agaaccctct gggaccttcc 480 

gcggccctct ctgcacacat cctcaagtga ccccgcacgg cgagacgcgg gtggcggcag 540 

ggcgtcccag ggtgcggcac cgcggctcca gtccttggaa ataattaggc aaattctaaa 600 

ggtctcaaaa ggagcaaagt aaaccgtgga ggacaaagaa aagggttgtt atttttgtct 660 

ttccagccag cctgctggct cccaagagag aggccttttc agttgagact ctgcttaaga 720 

gaagatccaa agttaaagct ctggggtcag gggaggggce gggggcagga aactacctct 780 

ggcttaattc ttttaagcca cgtaggaact ttcttgaggg ataggtggac cctgacatcc 840 

ctgtggcctt gcccaagggc tctgctggtc tttctgagtc acagctgcga ggtgatgggg 900 

gctggggccc caggcgtcag ctcccagagg gacagctgag ccccctgcct tggctccagg 960 

ttggtagaag cagccgaagg gctcctgaca gtggccaggg acccctgggt cccccaggcc 1020 

tgcagatgtt tctatgaggg gcagagctcc tggtacatcc atgtgtggct ctgctccacc 1080 

cctgtgccac cccagagccc tggggggtgg tctccatgcc tgccaccctg gcatcggctt 1140 

tctgtgccgc ctcccacaca aatcagcccc agaaggcccc ggggccttgg cttctgtttt 1200 

ttataaaaca cctcaagcag cactgcagtc tcccatctcc tcgtgggcta agcatcaccg 1260 

cttccacgtg tgttgtgttg gttggcagca aggctgatcc agaccccttc tgcccccact 1320 

gcgctcatcc aggcctctga ccagtagcct gagaggggct ttttctaggc ttcagagcan 1380 

gggagagctg gacggggtag acagtccgct tgtctgttct aagctctgtg agctcagtct 1440 
gagacaa 1447 

<210> 2 

<211> 2481 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 627722CB1 

<400> 2 

ctagcaagca ggtaaacgag ctttgtacaa acacacacag accaacacat ccggggatgg 60 
ctgtgtgttg ctagagcaga ggctgattaa acactcagtg tgttggctct ctgtgccact 120 
cctggaaaat aatgaattgg gtaaggaaca gttaataaga aaatgtgcct tgctaactgt 180 
gcacattaca acaaagagct ggcagctcct gaaggaaaag ggcttgtgcc gctgccgttc 240 
aaacttgtca gtcaactcat gccagcagcc tcagcgtctg cctccccagc acaccctcat 300 
tacatgtgtc tgtctggcct gatctgtgca tctgctcgga gacgctcctg acaagtcggg 360 
aatttctcta tttctccact ggtgcaaaga gcggatttct ccctgcttct cttctgtcac 420 
ccccgctcct ctcccccagg aggctccttg atttatggta gctttggact tgcttccccg 480 
tctgactgtc cttgacttct agaatggaag aagctgagct ggtgaaggga agactccagg 540 
ccatcacaga taaaagaaaa atacaggaag aaatctcaca gaagcgtctg aaaatagagg 600 
aagacaaact aaagcaccag catttgaaga aaaaggcctt gagggagaaa tggcttctag 660 
atggaatcag cagcggaaaa gaacaggaag agatgaagaa gcaaaatcaa caagaccagc 720 
accagatcca ggttctagaa caaagtatcc tcaggcttga gaaagagatc caagatcttg 780 
aaaaagctga actgcaaatc tcaacgaagg aagaggccat tttaaagaaa ctaaagtcaa 840 
ttgagcggac aacagaagac attataagat ctgtgaaagt ggaaagagaa gaaagagcag 900 
aagagtcaat tgaggacatc tatgctaata tccctgacct tccaaagtcc tacatacctt 960 
ctaggttaag gaaggagata aatgaagaaa aagaagatga tgaacaaaat aggaaagctt 1020 
tatatgccat ggaaattaaa gttgaaaaag acttgaagac tggagaaagt acagttctgt 1080 
cttcaatacc tctgccatca gatgacttta aaggtacagg aataaaagtt tatgatgatg 1140 
ggcaaaagtc agtgtatgca gtaagttcta atcacagtgc agcatacaat ggcaccgatg 1200 
gcctggcacc agttgaagta gaggaacttc taagacaagc ctcagagaga aactctaaat 1260 
ccccaacaga gtatcatgag cctgtatatg ccaatccctt ttacaggcct acaaccccac 1320 
agagagaaac ggtgacccct ggaccaaact ttcaagaaag gataaagatt aaaactaatg 1380 
gactgggtat tggtgtaaat gaatccatac acaatatggg caatggtctt tcagaggaaa 1440 
ggggaaacaa cttcaatcac atcagtccca ttccgccagt gcctcatccc cgatcagtga 1500 
ttcaacaagc agaagagaag cttcacaccc cgcaaaaaag gctaatgact ccttgggaag 1560 
aatcgaatgt catgcaggac aaagatgcac cctctccaaa gccaaggctg agccccagag 1620 
agacaatatt tgggaaatct gaacaccaga attcttcacc cacttgtcag gaggacgagg 1680 
aagatgtcag atataatatc gttcattccc tgcctccaga cataaatgat acagaaccgg 1740 
tgacaatgat tttcatgggg tatcagcagg cagaagacag tgaagaagat aagaagtttc 1800 
tgacaggata tgatgggatc atccatgctg agctggttgt gattgatgat gaggaggagg 1860 
aggatgaagg agaagcagag aaaccgtcct accaccccat agctccccat agtcaggtgt 1920 
accagccagc caaaccaaca ccacttccta gaaaaagatc agaagctagt cctcatgaaa 1980 
acacaaatca taaatccccc cacaaaaatt ccatatctct gaaagagcaa gaagaaagct 2040 
taggcagccc tgtccaccat tccccatttg atgctcagac aactggagat gggactgagg 2100 
atccatcctt aacagcttta aggatgagaa tggcaaagct gggaaaaaag gtgatctaag 2160 
agttgtacca cctatataaa catcctttga agaagaaact aagaagcatt tgcaaatttc 2220 
tcttctggat attttgttta ttttttctga agtccaaaaa attatcatta cagtgtacca 2280 
tattaagcca tgtgaataag tagtagtcat tatttgtgaa aaattcccaa aaagctgggg 2340 
aaaacaaatg tgtaactttt ccagttactt gacacgattc agtgggggaa aaccagcatt 2400 
ttttattcta ttgataccaa agcatttcta ataagagctt gttaaattta agaataaagt 2460 
tatttaaaat aaaaaaaaaa a 2481 



<210> 3 

<211> 2987 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> unsure 
<222> 2955 

<223> a or g or c or t, unknown, or other 



<220> 

<221> misc_feature 

<223> Incyte ID No.: 639644CB1 

<400> 3 

agaaaaaaag aaaaaagaaa aaaactaagg cagcagctct taataaataa cacctggagc 60 
agaatcggta aactgctttc acgttggctt ttgcagaagt ggcaatgcat tgaggataca 120 
tctggcaagc ttcgaattca caagtgtaaa ggacccagtg acctgctcac agtccggcag 180 
agcacgcgga acctctacgc tcgcggcttc catgacaaag acaaagagtg cagttgtagg 240 
gagtctggtt accgtgccag cagaagccaa agaaagagtc aacggcaatt cttgagaaac 300 
caggggactc caaagtacaa gcccagattt gtccatactc ggcagacacg ttccttgtcc 360 
gtcgaatttg aaggtgaaat atatgacata aatctggaag aagaagaaga attgcaagtg 420 
ttgcaaccaa gaaacattgc taagcgtcat gatgaaggcc acaaggggcc aagagatctc 480 
caggcttcca gtggtggcaa caggggcagg atgctggcag atagcagcaa cgccgtgggc 540 
ccacctacca ctgtccgagt gacacacaag tgttttattc ttcccaatga ctctatccat 600 
tgtgagagag aactgtacca atcggccaga gcgtggaagg accataaggc atacattgac 660 
aaagagattg aagctctgca agataaaatt aagaatttaa gagaagtgag aggacatctg 720 
aagagaagga agcctgagga atgtagctgc agtaaacaaa gctattacaa taaagagaaa 780 
ggtgtaaaaa agcaagagaa attaaagagc catcttcacc cattcaagga ggctgctcag 840 
gaagtagata gcaaactgca acttttcaag gagaacaacc gtaggaggaa gaaggagagg 900 
aaggagaaga gacggcagag gaagggggaa gagtgcagcc tgcctggcct cacttgcttc 960 
acgcatgaca acaaccactg gcagacagcc ccgttctgga acctgggatc tttctgtgct 1020 
tgcacgagtt ctaacaataa cacctactgg tgtttgcgta cagttaatga gacgcataat 1080 
tttcttttct gtgagtttgc tactggcttt ttggagtatt ttgatatgaa tacagatcct 1140 
tatcagctca caaatacagt gcacacggta gaacgaggca ttttgaatca gctacacgta 1200 
caactaatgg agctcagaag ctgtcaagga tataagcagt gcaacccaag acctaagaat 1260 
cttgatgttg gaaataaaga tggaggaagc tatgacctac acagaggaca gttatgggat 1320 
ggatgggaag gttaatcagc cccgtctcac tgcagacatc aactggcaag gcctagagga 1380 
gctacacagt gtgaatgaaa acatctatga gtacagacaa aactacagac ttagtctggt 1440 
ggactggact aattacttga aggatttaga tagagtattt gcactgctga agagtcacta 1500 
tgagcaaaat aaaacaaata agactcaaac tgctcaaagt gacgggttct tggttgtctc 1560 
tgctgagcac gctgtgtcaa tggagatggc ctctgctgac tcagatgaag acccaaggca 1620 
taaggttggg aaaacacctc atttgacctt gccagctgac cttcaaaccc tgcatttgaa 1680 
ccgaccaaca ttaagtccag agagtaaact tgaatggaat aacgacattc cagaagttaa 1740 
tcatttgaat tctgaacact ggagaaaaac cgaaaaatgg acggggcatg aagagactaa 1800 
tcatctggaa accgatttca gtggcgatgg catgacagag ctagagctcg ggcccagccc 1860 
caggctgcag cccattcgca ggcacccgaa agaacttccc cagtatggtg gtcctggaaa 1920 
ggacattttt gaagatcaac tatatcttcc tgtgcattcc gatggaattt cagttcatca 1980 
gatgttcacc atggccaccg cagaacaccg aagtaattcc agcatagcgg ggaagatgtt 2040 
gaccaaggtg gagaagaatc acgaaaagga gaagtcacag cacctagaag gcagcgcctc 2100 
ctcttcactc tcctctgatt agatgaaact gttaccttac cctaaacaca gtatttcttt 2160 
ttaacttttt tatttgtaaa ctaataaagg taatcacagc caccaacatt ccaagctacc 2220 
ctgggtacct ttgtgcagta gaagctagtg agcatgtgag caagcggtgt gcacacggag 2280 
actcatcgtt ataatttact atctgccaag agtagaaaga aaggctgggg atatttgggt 2340 
tggcttggtt ttgatttttt gcttgtttgt ttgttttgta ctaaaacagt attatctttt 2400 
gaatatcgta gggacataag tatatacatg ttatccaatc aagatggcta gaatggtgcc 2460 
tttctgagtg tctaaaactt gacacccctg gtaaatcttt caacacactt ccactgcctg 2520 
cgtaatgaag ttttgattca tttttaacca ctggaatttt tcaatgccgt cattttcagt 2580 
tagatgattt tgcactttga gattaaaatg ccatgtctat ttgattagtc ttattttttt 2640 
atttttacag gcttatcagt ctcactgttg gctgtcattg tgacaaagtc aaataaaccc 2700 
ccaaggacga cacacagtat ggatcacata ttgtttgaca ttaagctttt gccagaaaat 2760 
gttgcatgtg ttttacctcg acttgctaaa atcgattagc agaaaggcat ggctaataat 2820 
gttggtggtg aaaataaata aataagtaaa caaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2880 
aaaaaaaaaa aaaaaaaaaa aaaaagcaaa aaaagctgcc gccacagtta gatgaagaag 2940 
catgaggatc cgagngggtc gcctctttga gtggtgaggg agtcgcg 2987 

<210> 4 
<211> 2915 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No.: 1362659CB1 

<400> 4 

gaggpaagaa ttcggcacga gggacatttt 
acccggcaca ctcccccttc ctccagcccc 
gtaaagccct gtgccttttt ttcccctgaa 
ctctcaggag ttggggactt tgctaggaga 
ggagccacgt ttgcaggagc tccatttgta 
ccagttcatg tccctgactc tcacctccca 
gagtgatgag agtcaagaag aggggatgta 
atgacctgat ccacctagcc ttttcttctg 
gctgtccaca gtaggaaaca taaagaaaca 
atccatcatc gtaggaaata ggaaagcaaa 
ttcaataatt ctttttttgt gtcttaaata 
gtgttaggtt tcacatatat attcatcaac 
tgtattacct cagatcattt taaatagcaa 
cattcctgtt cacaaaaggt tctcatggtg 
tactttttaa aagtcaatgg ttttttttct 
atatagaaat atatgcaaaa attatagttt 
cagccatatg tattttgttt aaaggattta 
agggagcaca taaccagctg tttggcatga 
taaaaccaat acaccatact ttctttctgc 
aattgttggg ttctagactt ttttaatata 
aagtgtctat gtgcatatgt tttttatata 
ctggcagtgg gtaaatatgg cataagttaa 
tttgaaaagg gtctgatggg gagaaggaga 
acctagaaaa acgggtagta aactgtggat 
ctgtcaggaa atgaatcttc cccccaaccc 
cctgactagt cattaggatc aggggcctct 
gcagagtggt ataaaagaca cgaatatctc 
ttgcattttt catggttttt atttcctgtt 
gtgcaaggat cttatttgtg atgccttccc 
aatggggaca gaattctaaa tggataaaac 
ttaaggctag atccttccca tagtatcatc 
aggggttaag agagagatca cctagaaatc 
tgagtttctt cttccccttg agcttcagag 
ttacctcact gctgaaaacc cagaggggcg 
aaatgcatcc cttcctttct ttcctgcttc 
accatcacag tatgcagaga cttcctcacc 
tggtgagggt gggcacttat aaatgcctgc 
gaatagacca gacgcccttt cacttagttc 
ctgttaggcc tgctgttccc tttgctcttg 
ataactgaat tggcctttgg ttcatgtttt 
tatgccatat atatgtgcca acaaatctat 
ataggaattt tgagtttctt cttcttttag 
tacaaaaaag aaaaacaaaa gattgactat 
gtgatatcaa agcaacgtat accccagtcc 
ttaacagtgc acccaatcta tatttgcatt 
tttgcatgta tttatatggt tcttagggaa 
tcaaatgtgt tgttccactg agaccagaag 
ttggagccaa taaagctttt tgctgatgaa 
aaaaactcat gcccacttgt aaaaaaaaaa 



gccaacttaa acgagaaaaa gaccccccgc 60 
gcttcagcca catgctccag ctgctgccca 120 
tactgcccaa agcatcccct tcccatctgc 180 
ttttttaagt gttccttact gggacaacgt 24 0 
tccctgctgg tgttgacttc tgtgtagggg 300 
ttagataaat gaagcccacc cccctttcta 360 
tgaacggcca aattcccatg tgagaggaag 420 
gatctgtcct ccctcacccc tttcacctga 480 
atgtccccta catatcccca tgactacata 540 
tttgattttg gttttgtaaa acgtacatgc 600 
ctcatagggg aaaaaaacag ctcacccaag 660 
tattttagaa gatttaattc tatcaaatct 720 
gccaataacg agctttgaag gctattttac 780 
cctgacaggt tacccttgag ggcttgtgtc 840 
tgtgttctag tttccataat aggagagaaa 900 
tctttagatc agaaactgat atttttgggt 960 
aaataaag.tg ccgtcatgta gccctgtgga 1020 
caggtgactt agtatatttg taattggttt 1080 
aaacagccat ctttatactt agggaagaaa 1140
aattttgttg atatggaatt aggtaagttt 1200 
agttttttct attcagtttc actgatccaa 1260 
taacactttt ccccaaaatg gtgctttgga 1320 
acgtatcatc ctagcttcct ctcttaataa 1380 
agtcaggaaa acacccagca agggacacag 1440
ccaccatgca gatggataga cagaatcttt 1500 
gttggatttg tgtttcttga agaatagctg 1560 
ctggtctata aggatactct gatttggggt 1620 
ccccctggag ttttccatta gtgagttttt 1680 
tcccctagaa agattttgtg caatatatta 1740 
aatggctggt tctagccctg agtgacagtc 1800 
tgtcctctgg aatgactctc ctgtccctaa 1860
cctctggaca cttgtgggtt ctttagggtt 1920 
aggagagttg gcatggttaa atctgaatgg 1980 
tggcacactc gcttgtgtgg aaaagcctct 2040 
ctttgcctta caattgaagc agcccgtggt 2100 
tttcatatct agggaccacc cccgatgcat 2160
tattgttaag ccattccagc ctcttcctct 2220
agtgccagtc cttttgcctt cccaaccctg 2280
attaggagag atggaaggag atgagctccc 2340
ctccccatat gtatatatgc catatgtgaa 2400
ctacgttgtt cttttcaaat tagcacgcag 2460
taactagtat aacaagcact ggtatttttg 2520 
tgtggtctgc atgacataaa caaacaaatg 2580
agtgtgtgtt gccataattt gcaattcagc 2640 
ttgatattat ttaagctcta tgtacaaggt 2700 
aaaaaatgct ataaactgca aatctgaaat 2760 
aagaagagga gttttaaaag ggataatttg 2820 
cagaaaecai tactgctgtg cactgagaat 2880 
aaagg 2915
<211> 1826

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1446685CB1 

<400> 5 

gaaagccgca gcctcagtcc cgccgccgcc cgctgcgtcc gcccagcgcc agctccgcgt 60
cccgaccggc ccgcggcagc ctgcgccgcg ccatggccac ctccccgcag aagtcgcctt 120
ctgtccccaa gtctcccact cccaagtcgc ccccgtcccg caagaaagat gattccttct 180
tggggaaact cggagggacc ctggcccgga ggaagaaagc caaggaggtg tccgagctgc 240
aggaggaggg aatgaacgcc atcaacctgc ccctcagccc aattcccttt gagctggacc 300
ccgaggacac gatgctggag gagaatgagg tgcgaacaat ggtggatcca aactcacgca 360
gtgaccccaa gcttcaagaa ctgatgaagg tattaattga ctggattaat gatgtgttgg 420
ttggagaaag aatcattgtg aaagacctag ctgaagattt gtatgatgga caagtcctgc 480
agaagctttt cgagaaactg gagagtgaga agctaaatgt ggctgaggtc acccagtcag 540
agattgctca gaagcaaaaa ctgcagactg tcctggagaa gatcaatgaa accctgaaac 600
ttcctcccag gagcatcaag tggaatgtgg attctgttca tgccaagagc ctggtggcca 660
tcttacacct gctcgttgct ctgtctcagt atttccgcgc accaattcga ctcccagacc 720
atgtttccat ccaagtggtt gtggtccaga aacgagaagg aatcctccag tctcggcaaa 780
tccaagagga aataactggt aacacagagg ctctttccgg gaggcatgaa cgtgatgcct 840
ttgacacctt gttcgaccat gccccagaca agctgaatgt ggtgaaaaag acactcatca 900
ctttcgtgaa caagcacctg aataaactga acctggaggt cacagaactg gaaacccagt 960
ttgcagatgg ggtgtacctg gtgctgctca tggggctcct ggagggctac tttgtgcccc 1020
tgcacagctt cttcctgacc ccggacagct ttgaacagaa ggtcttgaat gtctcctttg 1080
cctttgagct catgcaagat ggagggttgg aaaagccaaa accgcggcca gaagacatag 1140
tcaactgtga cctgaaatct acactacgag tgttgtacaa cctcttcacc aagtaccgta 1200
acgtggagtg aggggctgcc ctgggcccac cactgcccaa gagttcttgc tgttggcgta 1260
ctggaccctc ctccgaactg ccttaccctg cttattcctg tctcttgcac tgtgctctcc 1320
cacaagtcca gctgcaaccc agagatagtg gaaactgaaa ttaggaagga aatcatcaat 1380
aactcagtgg gctgacccat ccctcccagg cgctggggac caacctagca atgaaggttg 1440
ggaaggttgt tcccttcccg gtgccaggtc cagatttccc tccatgattt gggaaccagg 1500
ttaggcaaaa gagtccccac aagatgaaaa taaagatcct agttaccatt caaaggatgc 1560
taactgtgtg tcaggcccca cactaagtgc tctgctctga tatactcaag gccattaatc 1620
ttcaggactc ccattgacgt aggtgtttca ttcccctttt acagatgagg aaactaaggc 1680
ttggaggtta aatgacttgc cagaagttgg aatttttttc ctctttgaac ataacctctc 1740
ccttctccct aaaggtaacc actattctga gtccaatcat caaggttttg cttttctttt 1800
tagctaagta tgcattcctc aatagt 1826



<210> 6 

<211> 1439 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1556751CB1 



<400> 6 

gagtatccct tgtttaatca cttttgtggt taaaagagac ctttgggtca gtctgcctca 60
ttccttgaag agtttagccc tggctcactt ttcactctat ttcttctcct gtctcaagaa 120
agaagaaaaa aagagacaaa ttacccagaa acccctccptL tccccacatg gaggccttgg 180
caaatgttaa ttttcctaga aaatccttca gacctgaaga cgcaggaaaa gaatctggct 240
ctcagggtgg cttctgcgtc cccgccgcca ggccccagac tatggtcaca gggccgtcct 300
gttcctcccc gggactccag aatttctctc ctcaaaggaa agaaaacagg gcatgcgctt 360
gttggcaaaa cgcagggccg gctcccaaaa accccatgtg tgtacgatta aaagttggcc 420
gtccccaggc ctcccagcgc aaacttaaag agacagggct ttgctgaaaa ccaaacatgg 480
gccagctggg ctttttaaca acctagagac tttccggagc tgcctggaac agagcctgcg 540
ggaaacgggg cttgccagag acactcacag tttccttcat ggcctgtttt ggtcccctaa 600
gaatctccac atcattgtct ttcttgtgcc ttttccttgg tgagcaacag aaagggaagg 660
gttccaagcc tctaaaaatg tgctttgtga tcaggagtgc gctccaaacc aaatacgcgc 720
gctgcccttt cgaggccagt gagctcagcc tccaaggctt taaagccaca tttcagcaag 780 
agaaagcgct gagagctcgc aggttcatta aagaaggcaa agcactggtt tctctcctta 840 
gaaaagtagg tttcttggct tgatgtagac tggcttgctt tgatttttag tgaagggaat 900 
gtacgtaaaa caaaataggg cttggctggt caaaggagac aagcaggatg gatggatgga 960 
tggatggatg gatgtatgga tgaatagata gatggtgttt gcatgtaaat tgcagagaaa 1020 
acaaaaccaa agctgattgg aaacaattaa ttgtgggtgt ctgaggggga aggtcgcagc 1080 
tttgggcagc tttgagaagc ggtacaagag ttctgtgcct gtgtgtccag ccctggagcc 1140 
agccagtgca tttattttaa gctcttagaa gcaactcctt ggcccaggaa tgcgtgaccc 1200
ctgagatggg tccacgcatc tctctacact tccttctctc cgtgggatac tggactcgtg 1260 
cctctgcgcc cattctcttc tcacgcatat ccatgagctt taatttcact ttctgatcac 1320 
ggtacgtcpa taaagccagt attacactta aatgaagtat tcttttttgt aatcgttttt 1380 
tttagaaggt aaacaaattt aataaagcta ccaataatga gaaaaaaaaa aaaaaaaaa 1439 



<210> 7 
<211> 3047 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No.: 1656953CB1 

<400> 7

cgagacagag gaaatgtgtc tccctccaag gccccaaagc ctcagagaaa gggtgtttct 60 
ggttttgcct tagcaatgca tcggtctctg aggtgacact ctggagcggt tgaagggcca 120
caaggtgcag ggttaatact cttgccagtt ttgaaatata gatgctatgg ttcagattgt 180 
ttttaataga aaactaaagg ggcaggggaa gtgaaaggaa agatggaggt tttgtgcggc 240 
tcgatggggc atttggaact tctttttaaa gtcatctcat ggtctccagt tttcagttgg 300 
aactctggtg tttaacactt aagggagaca aaggctgtgt ccatttggca aaacttcctt 360 
ggccacgaga ctctaggtga tgtgtgaagc tgggcagtct gtggtgtgga gagcagccat 420 
ctgtctggcc attcagagga ttctaaagac atggctggat gcgctgctga ccaacatcag 480 
cacttaaata aatgcaaatg caacatttct ccctctgggc cttgaaaatc cttgccctta 540 
tcatttgggg tgaaggagac atttctgtcc ttggcttccc acagccccaa cgcagtctgt 600 
gtatgattcc tgggatccaa cgagccctcc tattttcaca gtgttctgat tgctctcaca 660 
gcccaggccc atcgtctgtt ctctgaatgc agccctgttc tcaacaacag ggaggtcatg 720 
gaacccctct gtggaaccca caaggggaga aatgggtgat aaagaatcca gttcctcaaa 780 
accttccctg gcaggctggg tccctctcct gctgggtggt gctttctctt gcacaccact 840 
cccaccacgg ggggagagcc agcaacccaa ccagacagct caggttgtgc atctgatgga 900 
aaccactggg ctcaaacacg tgctttattc tcctgtttat ttttgctgtt actttgaagc 960 
atggaaattc ttgtttgggg gatcttgggg ctacagtagt gggtaaacaa atgcccaccg 1020 
gccaagaggc cattaacaaa tcgtccttgt cctgaggggc cccagcttgc tcgggcgtgg 1080 
cacagtgggg aatccaaggg tcacagtatg gggagaggtg caccctgcca cctgctaact 1140
tctcgctaga cacagtgttt ctgcccaggt gacctgttca gcagcagaac aagccagggc 1200 
catggggacg ggggaagttt tcacttggag atggacacca agacaatgaa gatttgttgt 1260 
ccaaataggt caataattct gggagactct tggaaaaaac tgaatatatt caggaccaac 1320 
tctctccctc ccctcatccc acatctcaaa gcagacaatg taaagagaga acatctcaca 1380 
cacccagctc gccatgccta ctcattcctg aatttcaggt gccatcactg ctctttcttt 1440
cttctttgtc atttgagaaa ggatgcagga ggacaattcc cacagataat ctgaggaatg 1500 
cagaaaaacc agggcaggac agttatcgac aatgcattag aacttggtga gcatcctctg 1560 
tagagggact ccacccctgc tcaacagctt ggcttccagg caagaccaac cacatctggt 1620 
ctctgccttc ggtggcccac acacctaagc gtcatcgtca ttgccatagc atcatgatgc 1680 
aacacatcta cgtgtagcac tacgacgtta tgtttgggta atgtggggat gaactgcatg 1740
aggctctgat taaggatgtg gggaagtggg ctgcggtcac tgtcggcctt gcaaggccac 1800 
ctggaggcct gtctgttagc cagtggtgga ggagcaaggc ttcaggaagg gccagccaca 1860 
tgccatcttc cctgcgatca ggcaaaaaag tggaattaaa aagtcaaacc tttatatgca 1920 
tgtgttatgt ccattttgca ggatgaactg agtttaaaag aatttttttt tctcttcaag 1980 
ttgctttgtc ttttccatcc tcatcacaag cccttgtttg agtgtcttat ccctgagcaa 2040 
tctttcgatg gatggagatg atcattaggt acttttgttt caacctttat tcctgtaaat 2100 
atttctgtga aaactaggag aacagagatg agatttgaca aaaaaaaatt gaattaaaaa 2160 
taacacagtc tttttaaaac taacatagga aagcctttcc tattatttct cttcttagct 2220
tctccattgt ctaaatcagg aaaacaggaa aacacagctt tctagcagct gcaaaatggt 2280
ttaatgcccc ctacatattt ccatcacctt gaacaatagc tttagcttgg gaatctgaga 2340

tatgatccca gaaaacatct gtctctactt cggctgcaaa acccatggtt taaatctata 2400

tggtttgtgc attttctcaa ctaaaaatag agatgataat ccgaattctc catatattca 2460

ctaatcaaag acactatttt catactagat tcctgagaca aatactcact gaagggcttg 2520 

tttaaaaata aattgtgttt tggtctgttc ttgtagataa tgcccttcta ttttaggtag 2580 

aagctctgga atccctttat tgtgctgttg ctcttatctg caaggtggca agcagttctt 2640 

ttcagcagat tttgcccact attcctctga gctgaagttc tttgcataga tttggcttaa 2700 

gcttgaatta gatccctgca aaggcttgct ctgtgatgtc agatgtaatt gtaaatgtca 2760

gtaatcactt catgaacgct aaatgagaat gtaagtattt ttaaatgtgt gtatttcaaa 2820 

tttgtttgac taattctgga attacaagat ttctatgcag gatttacctt catcctgtgc 2880 

atgtttccca aactgtgagg agggaaggct cagagatcga gcttctcctc tgagttctaa 2940 

caaaatggtg ctftgagggt cagcctttag gaaggtgcag ctttgttgtc ctttgagctt 3000 

tctgttatgt gcctatccta ataaactctt aaacacaaaa aaaaaaa 3047 



<210> 8 

<211> 3017 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1662318CB1 

<400> 8 

cgcaaactca accctttcgg aaacaccttc ctcaacaggt tcatgtgtgc ccagctccct 60 
aatcaggtcc tggagagcat cagcatcatc gacaccccgg gtatcctgtc gggtgccaag 120 
cagagagtga gccgcggcta cgacttcccg gccgtgctgc gctggttcgc ggagcgcgtg 180 
gacctcatca tcctgctctt tgatgcgcac aagctggaga tctcggacga gttctcagag 240 
gccatcggcg cgttgcgggg ccatgaggac aagatccgcg tggtgctcaa caaggccgac 300 
atggtggaga cgcagcagct gatgcgcgtc tacggcgcgc tcatgtgggc gctgggcaag 360 
gtggtgggca cgcccgaggt gctgcgcgtc tacatcggct ccttctggtc ccagcccctc 420 
ctggtgcccg acaaccggcg cctcttcgag ctggaggagc aggacctctt ccgcgacatc 480
cagggcctgc cccggcacgc agccttgcgc aagctcaacg acctggtgaa gagggcccgg 540 
ctggtgcgag ttcacgctta catcatcagc tacctgaaga aggagatgcc ctctgtgttt 600 
gggaaggaga acaagaagaa gcagctgatc ctcaaactgc ccgtcatctt tgcgaagatt 660 
cagctggaac atcacatctc ccctggggac tttcctgatt gccagaaaat gcaggagctg 720 
ctgatggcgc acgacttcac caagtttcac tcgctgaagc cgaagctgct ggaggcactg 780 
gacgagatgc tgacgcacga catcgccaag ctcatgcccc tgctgcggca ggaggagctg 840 
gagagcaccg aggtgggcgt gcaggggggc gcttttgagg gcacccacat gggcccgttt 900 
gtggagcggg gacctgacga ggccatggag gacggcgagg agggctcgga cgacgaggcc 960 
gagtgggtgg tgaccaagga caagtccaaa tacgacgaga tcttctacaa cctggcgcct 1020 
gccgacggca agctgagcgg ctccaaggcc aagacctgga tggtggggac caagctcccc 1080
aactcagtgc tggggcgcat ctggaagctc agcgatgtgg accgcgacgg catgctggat 1140 
gatgaagagt tcgcgctggc cagccacctc atcgaggcca agctggaagg ccacgggctg 1200 
cccgccaacc tgccccgtcg cctggtgcca ccctccaagc gacgccacaa gggctccgcc 1260 
gagtgagccg ggcccccctc ccatggccct gctgtggctc cccagctcca gtcggctgca 1320 
cgcacacccc tgctccggct cacacacgcc ctgcctgbcc tccctgccca gctgtaagga 1380 
ccgggggtct ccctcctcac taccgccaga caccccggtg gaagcattta gaggggacca 1440
cgggagggac aaggcttctc tgtccgccct tcacacctcc agcctcacgt tcacttaggc 1500 
acatcacaca cacactggca cacgcaggca tccatccatc cgtcattcat tcaaatattt 1560 
attgagcacc tactatgtgc ccagccctgt tctaggcact gggcattacc atagagaaca 1620 
aaatagacaa atacatctgc cctcatggaa ggtgacgttc ccaggagagg gcacctacac 1680 
agtcacgcaa acacacacta attcctggca gggcccccag cccctcccct ggctgagcag 1740 
ccctgtggct gaaatgacta gcagataaac agaccccbtt ctgctccgct tcctcctgcc 1800 
cagccaggca acaccctcaa ccggctccat cacatcdtca ggtctcggga ccatgggggg 1860 
ctcagagggg agacacacct actgcttcct cagatgggcc cctccgcagc cccttccctt 1920 
gctcggggaa agcccccaat tctgcccaca cccatttatt tccttccttc cttccttctt 1980
ttctttcctt ccttccttct tttttgtttt tgcccccaat tctgcccata cccatttctt 2040
tctttccttc cttccttctt ttttgttttt gcccccagtt ctgtccacac cccttccctt 2100
tcctgtcctg tcctttcttt cttttttgat agaatcttgc tctgtcgccc aggctgggag 2160
tgcagtggtg agatctcagc tcactgcaac ctccacctcc tgggttgaag tgattctcgt 2220
gcctcagcct cctgagtagc tgggactgca ggcacgcgcc accacgccca gctaattttt 2280
gtatttgagt agagacgggg tttcaccatg ttggccaggc tggtctcgaa ctccgcatct 2340
caggtgatct gctcgcctcg gcctcccaaa gtgatgggat tacaggcatg agccaccgtg 2400
cccggcttca cacccatttc tttaaaaagg atcccgtagc aggcagaaaa gccccttcca 2460
tcctgctcct ctgatactgt gcccccftgg agatatttcc gtcctccacc cacgtgtctg 2520
tggctggaac tgcccagcct gctcctggcc ccctggaagc ctccccacag ctggtaatct 2580
ggacttaagg attgctgggc caccgcctct ctgcctacca ccattccata tttaagtgga 2640 
gcccctacgt agaaaggccc cggggcttta ttttagtctc cttttcaggg atgtcgtggg 2700 
cgggggaggg ggttcttggt gctacagccc tctccccacc cctaaaggga cgccgacgct 2760 
gtttgctgcc ttcaccacat attagtgctt gaccctggca ggggacccca tggaaaagat 2820 
ggggaagagc aaaatacatg gagacgacgc accctccagg atgctcgctg ggattcccac 2880 
gcccaccact gtcccccacc ccatggctgg gaggggcctc tgaacggaac agtgtcccca 2940 
cagagcgaat aaagcaaggc ttcttcccca aaaaaaaaaa aaaaaaaaaa attggtgcgg 3000 
ccgaagttat tccct£c 3017



<210> 9 

<211> 1735 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1996726CB1 

<400> 9 

tcgggaggaa ggagactaca cctgctttgc tgaaaatcag gtcgggaagg acgagatgag 60
agtcagagtc aaggtggtga cagcgcccgc caccatccgg aacaagactt acttggcggt 120
tcaggtgccc tatggagacg tggtcactgt agcctgtgag gccaaaggag aacccatgcc 180
caaggtgact tggttgtccc caaccaacaa ggtgatcccc acctcctctg agaagtatca 240
gatataccaa gatggcactc tccttattca gaaagcccag cgttctgaca gcggcaacta 300
cacctgcttg gtcaggaaca gcgcgggaga ggataggaag acggtgtgga ttcacgtcaa 360
cgtccagcca cccaagatca acggtaaccc caaccccatc accaccgtgc gggagatagc 420
agccgggggc agtcggaaac tgattgactg caaagctgaa ggcatcccca ccccgagggt 480
gttatgggct tttcccgagg gtgtggttct gccagctcca tactatggaa accggatcac 540
tgtccatggc aacggttccc tggacatcag gagtttgaijg aagagcgact ccgtccagct 600
ggtatgcatg gcacgcaacg agggagggga ggccaggttg atcgtgcagc tcactgtcct 660
ggagcccatg gagaaaccca tcttccacga cccgatcagc gagaagatca cggccatggc 720
gggccacacc atcagcctca actgctctgc cgcggggacc ccgacaccca gcctggtgtg 780
ggtccttccc aatggcaccg atctgcagag tggacagcag ctgcagcgct tctaccacaa 840
ggctgacggc atgctacaca ttagcggtct ctcctcggtg gacgccgggg cctaccgctg 900
cgtggcccgc aatgccgctg gccacacgga gaggctggtc tccctgaagg tgggactgaa 960
gccagaagca aacaagcagt atcataacct ggtcagcatc atcaatggtg agaccctgaa 1020
gctcccctgc acccctcccg gggctgggca gggacgtttc tcctggacgc tccccaatgg 1080
catgcatctg gagggccccc aaaccctggg acgcgtttct cttctggaca atggcaccct 1140
cacggttcgt gaggcctcgg tgtttgacag gggtacctat gtatgcagga tggagacgga 1200
atacggccct tcggtcacca gcatccccgt gattgtgatc gcctatcctc cccggatcac 1260
cagcgagccc accccggtca tctacacccg gcccgggaac accgtgaaac tgaactgcat 1320
ggctatgggg attcccaaag ctgacatcac gtgggagtta ccggataagt cgcatctgaa 1380
ggcaggggtt caggctcgtc tgtatggaaa cagatttctt cacccccagg gatcactgac 1440
catccagcat gccacacaga gagatgccgg cttctacaag tgcatggcaa aaaacattct 1500
cggcagtgac tccaaaacaa cttacatcca cgtcttctga aatgtggatt ccagaatgat 1560
tgcttaggaa ctgacaacaa agcggggttt gtaagggaag ccaggttggg gaataggagc 1620
tcttaaataa tgtgtcacag tgcatggtgg cctctgg^g g;tttcaagtt gaggttgatc 1680
ttgatctaca attgttggga aaaggaagca atgcagacac gagaaggagg gctca 1735



<210> 10

<211> 1016

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No.: 2137155CB1

<400> 10

ctgtacgttc ccctgtggcc cacgcctagt gaaaatgata tcgtacatct ccctagagat 60
atgggtcacc tccaggtaga ttacagagat aacaggcigc acccaagtga agattcttca 120
ctggactcca ttgcctcagt tgtggttccc ataattatat gcctctctat tataatagca 180
ttcctattca tcaatcagaa gaaacagtgg ataccactgc tttgctggta tcgaacacca 240
actaagcctt cttccttaaa taatcagcta gtatctgtgg actgcaagaa aggaaccaga 300
gtccaggtgg acagttccca gagaatgcta agaattgcag aaccagatgc aagattcagt 360
ggcttctaca gcatgcaaaa acagaaccat ctacaggcag acaatttcta ccaaacagtg 420
tgaagaaagg caactaggat gaggtttcaa aagacggaag acgactaaat ctgctctaaa 480
aagtaaacta gaatttgtgc acttgcttag tggattgtat tggattgtga cttgatgtac 540
agcgctaaga ccttactggg atgggctctg tctacagcaa tgtgcagaac aagcattccc 600
acttttcctc aagataactg accaagtgtt tcttagaacc aaagttttta aagttgctaa 660
gatatatttg cctgtaagat agctgtagag atatttgggg tggggacagt gagtttggat 720
ggcgaaatac accgcacggt ggtgttggga agaaaaattt gtcagcttgg ctcggggaga 780
aaccctggta cactaaagca gttcagtgtg ccagaggtta tttttttccc attgctctga 840
agactgcact ggttgctgca aactcaggcc tgaatgagcg gaaacaaaaa aagccttgcg 900
ccccgatgcc ataacacctt tggaatcccg agcggccctc agaaaccttt tcaggcatcc 960
aggtcttaag cccaagtatc tttctataca gtcccactgc ggtgagcgtg ggggag 1016



<210> 11 

<211> 2288 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No.: 2268890CB1



<400> 11 

caaccagggt caggctgtgc tcacagtttc ctctggcggc atgtaaaggc tccacaaagg 60
agttgggagt tcaaatgagg ctgctgcgga cggcctgagg atggacccca agccctggac 120
ctgccgagcg tggcactgag gcagcggctg acgctactgt gagggaaaga aggttgtgag 180
cagccccgca ggacccctgg ccagccctgg ccccagcctp tgccggagcc ctctgtggag 240
gcagagccag tggagcccag tgaggcaggg ctgcttggca gccaccggcc tgcaactcag 300
gaacccctcc agaggccatg gacaggctgc cccgctgacg gccagggtga agcatgtgag 360
gagccgcccc ggagccaagc aggagggaag aggctttcat agattctatt cacaaagaat 420
aaccaccatt ttgcaaggac catgaggcca ctgtgcgtga catgctggtg gctcggactg 480
ctggctgcca tgggagctgt tgcaggccag gaggacggtt ttgagggcac tgaggagggc 540
tcgccaagag agttcattta cctaaacagg tacaagcggg cgggcgagtc ccaggacaag 600
tgcacctaca ccttcattgt gccccagcag cgggtcacgg gtgccatctg cgtcaactcc 660
aaggagcctg aggtgcttct ggagaaccga gtgcataagc aggagctaga gctgctcaac 720
aatgagctgc tcaagcagaa gcggcagatc gagacgctgc agcagctggt ggaggtggac 780
ggcggcattg tgagcgaggt gaagctgctg cgcaaggaga gccgcaacat gaactcgcgg 840
gtcacgcagc tctacatgca gctcctgcac gagatcatcc gcaagcggga caacgcgttg 900
gagctctccc agctggagaa caggatcctg aaccagacag ccgacatgct gcagctggcc 960
agcaagtaca aggacctgga gcacaagtac cagcacctgg ccacactggc ccacaaccaa 1020
tcagagatca tcgcgcagct tgaggagcac tgccagaggg tgccctcggc caggcccgtc 1080
ccccagccac cccccgctgc cccgccccgg gtctaccaap cacccaccta caaccgcatc 1140
atcaaccaga tctctaccaa cgagatccag agtgaccaga acctgaaggt gctgccaccc 1200
cctctgccca ctatgcccac tctcaccagc ctcccatctt ccaccgacaa gccgtcgggc 1260
ccatggagag actgcctgca ggccctggag gatggccacg acaccagctc catctacctg 1320
gtgaagccgg agaacaccaa ccgcctcatg caggtgtggt gcgaccagag acacgacccc 1380
gggggctgga ccgtcatcca gagacgcctg gatggctctg ttaacttctt caggaactgg 1440
gagacgtaca agcaagggtt tgggaacatt gatggcgaat actggctggg cctggagaac 1500
atttactggc tgacgaacca aggcaactac aaactcctgg tgaccatgga ggactggtcc 1560
ggccgcaaag tctttgcaga atacgccagt ttccgcctgg aacctgagag cgagtattat 1620
aagctgcggc tggggcgcta ccatggcaat gcgggtgact cptttacatg gcacaacggc 1680
aagcagttca ccaccctgga cagagatcat gatgitetaca caggaaactg tgcccactac 1740
cagaagggag gctggtggta taacgcctgt gcccactcca acctcaacgg ggtctggtac 1800
cgcgggggcc attaccggag ccgctaccag gacggagtct actgggctga gttccgagga 1860
ggctcttact cactcaagaa agtggtgatg atgatccgac cgaaccccaa caccttccac 1920
taagccagct ccccctcctg acctctcgtg gccattgcca ggagcccacc ctggtcacgc 1980 
tggccacagc acaaagaaca actcctcacc agttcatcct gaggctggga ggaccgggat 2040 
gctggattct gttttccgaa gtcactgcag cggatgatgg aactgaatcg atacggtgtt 2100 
ttctgtccct cctactttcc ttcacaccag acagcccctc atgtctccag gacaggacag 2160 
gactacagac aactctttct ttaaataaat taagtctcta caataaaaac acaactgcaa 2220 
agtaccttca taatatacat gtgtatgagc ctcccttgtg cacgtatgtg tatagcacat 2280 
atatatgg 2288 



<210> 12 

<211> 3304

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 2305981CB1 

<400> 12 

ccctcttatg gattcccagc aagcatcagg aaccattgtg caaattgtca tcaataacaa 60
acacaagcat ggacaagtgt gtgtttccaa tggaaagacc tattctcatg gcgagtcctg 120
gcacccaaac ctccgggcat ttggcattgt ggagtgtgtg ctatgtactt gtaatgtcac 180
caagcaagag tgtaagaaaa tccactgccc caatcgatac ccctgcaagt atcctcaaaa 240
aatagacgga aagtgctgca aggtgtgtcc aggtaaaaaa gcaaaagaag aacttccagg 300
ccaaagcttt gacaataaag gctacttctg cggggaagaa acgatgcctg tgtatgagtc 360
tgtattcatg gaggatgggg agacaaccag aaaaatagca ctggagactg agagaccacc 420
tcaggtagag gtccacgttt ggactattcg aaagggcatt ctccagcact tccatattga 480
gaagatctcc aagaggatgt ttgaggagct tcctcacttc aagctggtga ccagaacaac 540
cctgagccag tggaagatct tcaccgaagg agaagctcag atcagccaga tgtgttcaag 600
tcgtgtatgc agaacagagc ttgaagattt agtcaaggtt ttgtacctgg agagatctga 660
aaagggccac tgttaggcaa gacagacagt attggatagg gtaaagcaag aaaactcaag 720
ctgcagctgg actgcaggct tattttgctt aagtcaacag tgccctaaaa ctccaaactc 780
aaatgcagtc aattattcac gccatgcaca gcataatttg ctcctttgtg tggagtggtg 840
tgtcagccct tgaacatctc ctccaaagag actagaagag tcttaaatta tatgtgggag 900
gaggagggat agaacatcac aacactgctc tagtttcttg gagaatcaca tttctttaca 960
ggttaaagac aaacaagacc ccagggtttt tatctagaaa gttattcaag tgaaagaaag 1020
agaagggaat tgcttagtag gagttctgca gtatagaaca attacttgta tgaaattata 1080
cctttgaatt ttagaatgtc atgtgttctt ttaaaaaaat tagctcccca tcctccctcc 1140
tcactccctc cctccctcct tctctctctc tctctctctc cctctctcac agacacacac 1200
acacacacac acacacgcac acgcacgtcc acactcacat taaactaaag ctttatttga 1260
agcaaagcta gccaaaattc tacgttactt ttcccttgac tggatcccaa gtagcttgga 1320
agtttttgtg cccaggagag taaataactg tgaacaagag gctctgccct taggtctttg 1380
tggctgttta agtcaccaac aatagagtca gggtaaagaa taaaaacact ttcatagcct 1440
cattcattca cttagaagtg gtaataattt ttccctaatg ataccacttt tcttttcccc 1500
ctgtacctat gggacttcca gaaagaagtt aaattgagta aaatcatcag aaactgaatc 1560
catgtaagaa aaaataattg ttgaagaaag aagttgatag aattcaaaaa ggccatcttt 1620
ttgctttcac atcaataaaa tttaccaagt aatagatcag tactcactaa tatttttgag 1680
accatagttg tctggtcaga aaaattatat taaattagta ajattctagaa gctctttaaa 1740
agggaagttt tccttcttct ccaattatag gagttgattt ttactttgca aagtggctcg 1800
gtcctcatga gcatctgcat gttgactctt cagttaagaa aattgttgtt catttaggga 1860
ggtggatatt ctgatgaaga tctttatcct aaaccttcct actatccttg tcttattcat 1920
caagcagata ttttagtcaa gaattccaga gaaggctgct cctaaaatgt ctacttgcag 1980
cccaatacca gagcataaac tatccattct ggggtctggc tttagaaatc atctttgtgg 2040
gaagacctaa ttcttcacag caaggatctc aggcatgcct tctagatttg ttccctctga 2100
ggggcaggaa tgaactgtag aaatgtttta aggacccaga aaccccatat gtctcattcc 2160
atgactatag gtgagagaat tctttcctaa gagggtttga taccaatagg ggaaaatgta 2220
aaatgttcag tctttatgac aacctggcat aaaggagtca attcttatga aagagacaca 2280
agggccttat ggccagggtt tcttgggaca agactctcac cagcacatca cacacgttct 2340
ccttggaaga gagaagcagt acatcccggt tgagaggtca caaagcatta gtttgtgtgt 2400
gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtggtaaagg ggggaaggtg 2460
ttatgcggct gctccctccg tcccagaggt ggcagtgatt ccataatgtg gagactagta 2520
actagatcct aaggcaaaga ggtgtttctc cttctggatg attcatccca aagccttccc 2580
acccaggtgt tctctgaaag cttagcctta agagaacacg cagagagttt ccctagatat 2640
actcctgcct ccaggtgctg ggacacacct ttgcaaaatg ctgtgggaag caggagctgg 2700
ggagctgtgt taagtcaaag tagaaaccct ccagtgtttg gtgttgtgta gagaatagga 2760
catagggtaa agaggccaag ctgcctgtag ttagtagaga agaatggatg tggttcttct 2820
tgtgtattta tttgtatcat aaacacttgg aacaacaaag accataagca tcatttagca 2880 
gttgtagcca ttttctagtt aactcatgta aacaagtaag agtaacataa cagtattacc 2940 
ctttcactgt tctcacagga catgtaccta attatggtac ttatttatgt agtcactgta 3000 
tttctggatt tttaaattaa taaaaaagtt aattttgaaa aaaaaaaaaa aaaaaaaaaa 3060 
aaaaaaaaaa aaaaaaaaaa actcgagggg gggcctgtac cgggttcccc gtaacaggtt 3120 
cgcccttaag attccctggc cgcagttttt ggccgcgttt tggggaacct ctgggtaccc 3180 
ccttagttgc tcgctaaaat cccctttcgc agcccgttta aaggctgggg ccggccgatt 3240 
gccttcccaa tagcctccca tgaatgggaa tggaattgga agggaaattt tggtaaatcc 3300 
ggta 3304 



<210> 13 

<211> 708 

<212> DNA 

<213> Homo sapiens 



<220>

<221> misc_feature 

<223> Incyte ID No.: 2457612CB1 

<400> 13

ggaaagccag gaagtgcagg aatcatttca tcagggccaa taactacacc acccctgagg 60

tcaacaccca ggcctactgg aactcccttg gagagaatag agacagatgt aaagcaacca 120 

acagttcctg cctctggaga agaactggaa aatataactg actttagctc aagcccaaca 180 

agagaaactg atcctcttgg gaagccaaga ttcaaaggac ctcatgtgcg atacatccaa 240 

aagcctgaca acagtccctg ctccattact gactctgtca aacggttccc caaagaggag 300

gccacagagg ggaatgccac cagcccacca cagaacccac ccaccaacct cactgtggtc 360 

accgtggaag ggtgcccctt catttgtcat cttggactgg gaaaagccac taaatgacac 420 

tgtcactgaa tatgaagtta tatccagaga aaatgggtca ttcagtggga agaacaagtc 480 

cattcaaatg acaaatcaga cattttccac agtagaaaat ctgaaaccaa acacgagtta 540 

tgaattccag gtgaaaccca aaaacccgct tggtgaaggc ccggtcagca acacagtggc 600 

attcagtact gaatcagcgg acccagagtg agtgagcagt ttctgcagga gagatgcctc 660 

tggactgaag gccgctttgt tcgactcttg ctcaggtgta agggcaac 708 



<210> 14 

<211> 2040

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No.: 2814981CB1
<400> 14

cggccagccg ccgcgcgctg cagctctccg ggacgcccgt gcgccagctg cagaagggcg 60
cctgcccgtt gggtctccac cagctgagca gcccgcgcta caagttcaac ttcattgctg 120
acgtggtgga gaagatcgca ccagccgtgg tccacataga gctcttcctg agacacccgc 180
tgtttggccg caacgtgccc ctgtccagcg gttctggctt catcatgtca gaggccggcc 240
tgatcatcac caatgcccac gtggtgtcca gcaacagtgc tgccccgggc aggcagcagc 300 
tcaaggtgca gctacagaat ggggactcct atgaggccac catcaaagac atcgacaaga 360 
agtcggacat tgccaccatc aagatccatc ccaagaaaaa gctccctgtg ttgttgctgg 420 
gtcactcggc cgacctgcgg cctggggagt ttgtggtggc catcggcagt cccttcgccc 480 
tacagaacac agtgacaacg ggcatcgtca gcactgccca gcgggagggc agggagctgg 540 
gcctccggga ctccgacatg gactacatcc agacggatgc catcatcaac tacgggaact 600 
ccgggggacc actggtgaac ctggatggcg aggtcattgg catcaacacg ctcaaggtca 660 
cggctggcat ctcctttgcc atcccctcag accgcatcac acggttcctc acagagttcc 720 
aagacaagca gatcaaagac tggaagaagc gcttcatcgg catacggatg cggacgatca 780 






caccaagcct ggtggatgag ctgaaggcca gcaacccgga cttcccagag gtcagcagtg 840 
gaatttatgt gcaagaggtt gcgccgaatt caccttctca gagaggcggc atccaagatg 900
gtgacatcat cgtcaaggtc aacgggcgtc ctctagtgga ctcgagtgag ctgcaggagg 960 
ccgtgctgac cgagtctcct ctcctactgg aggtgcggcg ggggaacgac gacctcctct 1020 
tcagcatcgc acctgaggtg gtcatgtgag gggcgcattc ctccagcgcc aagcgtcaga 1080 
gcctgcagac aacggagggc agcgcccccc cgagatcagg acgaaggacc accgtcggtc 1140 
ctcagcaggg cggcagcctc ctcctggctg tccggggcag agcggaggct gggcttggcc 1200 
aggggcccga atttccgcct ggggagtgtt ggatccacat cccggtgccg gggagggaag 1260 
cccaacatcc ccttgtacag atgatcctga aagtcacttc caagttctcc ggatattcac 1320 
aaaactgcct tccatggagg tcccctcctc tcctagcttc ccgcctctgc ccctgtgaac 1380 
acccatctgc agtatcccct gctcctgccc ctcctactgc aggtctgggc tgccaagctt 1440 
cttcccccct gacaaacgcc cacctgacct gaggccccag cttccctctg ccctaggact 1500 
taccaagctg tagggccagg gctgctgcct gccagcctgg ggtccctgga ggacaggtca 1560 
catctgatcc ctttggggtg cgggggtggg gtccagccca gagcaggcac tgagtgaatg 1620 
ccccctggct gcggagctga gccccgccct gccatgaggt tttcctcccc aggcaggcag 1680 
gaggccgcgg ggagcacgtg gaaagttggc tgctgcctgg ggaagcttct cctccccaag 1740 
gcggccatgg ggcagcctgc agaggacagt ggacgtggag ctgcggggtg tgaggactga 1800 
gccggcttcc ccttcccacg cagctctggg atgcagcagc cgctcgcatg gaagtgccgc 1860 
ccagaggcat gcaggctgct gggcaccacc ccctcatcca gggaacgagt gtgtctcaag 1920 
gggcatttgt gagctttgct gtaaatggat tcccagtgtt gcttgtactg tatgtttctc 1980 
tactgtatgg aaaataaagt ttacaagcac aaaaaaaaaa aaaaaaaaaa aaaaaaaagg 2040 



<210> 15 

<211> 2121 

<212> DNA 

<213> Homo sapiens 

<220>

<221> misc_feature 

<223> Incyte ID No.: 3089150CB1 

<400> 15

gtaaaagctg gttgtgatcg catcatagac tccaaaaaga agtttgataa atgtggtgtt 60 

tgcgggggaa atggatctac ttgtaaaaaa atatcaggat cagttactag tgcaaaacct 120 

ggatatcatg atatcatcac aattccaact ggagccacca acatcgaagt gaaacagcgg 180 

aaccagaggg gatccaggaa caatggcagc tttcttgcca tcaaagctgc tgatggcaca 240 

tatattctta atggtgacta cactttgtcc accttagagc aagacattat gtacaaaggt 300 

gttgtcttga ggtacagcgg ctcctctgcg gcattggaaa gaattcgcag ctttagccct 360 

ctcaaagagc ccttgaccat ccaggttctt actgtgggca atgcccttcg acctaaaatt 420 

aaatacacct acttcgtaaa gaagaagaag gaatctttca atgctatccc cactttttca 480 

gcatgggtca ttgaagagtg gggcgaatgt tctaagtcat gtgaattggg ttggcagaga 540 

agactggtag aatgccgaga cattaatgga cagcctgctt ccgagtgtgc aaaggaagtg 600 

aagccagcca gcaccagacc ttgtgcagac catccctgcc cccagtggca gctgggggag 660 

tggtcatcat gttctaagac ctgtgggaag ggttacaaaa aaagaagctt gaagtgtctg 720 

tcccatgatg gaggggtgtt atctcatgag agctgtgatp ctttaaagaa acctaaacat 780 

ttcatagact tttgcacaat ggcagaatgc agttaagtgg tttaagtggt gttagctttg 840 

agggcaaggc aaagtgagga agggctggtg cagggaaagc aagaaggctg gagggatcca 900 

gcgtatcttg ccagtaacca gtgaggtgta tcagtaaggt gggattatgg gggtagatag 960 

aaaaggagtt gaatcatcag agtaaactgc cagttgcaaa tttgatagga tagttagtga 1020

ggattattaa cctctgagca gtgatatagc ataataaagc cccgggcatt attattatta 1080 

tttcttttgt tacatctatt acaagtttag aaaaaacaaa gcaattgtca aaaaaagtta 1140 

gaactattac aacccctgtt tcctggtact tatcaaatac ttagtatcat gggggttggg 1200 

aaatgaaaag taggagaaaa gtgagatttt actaagatct gttttacttt acctcactaa 1260 

caatgggggg agaaaggagt acaaatagga tctttgacca gcactgttta tggctgctat 1320 

ggtttcagag aatgtttata cattatttct accgagaatt aaaacttcag attgttcaac 1380 

atgagagaaa ggctcagcaa cgtgaaataa cgcaaatggp ttcctctttc cttttttgga 1440

ccatctcagt ctttatttgt gtaattcatt ttgaggaaaa aacaactcca tgtatttatt 1500 

caagtgcatt aaagtctaca atggaaaaaa agcagtgaag cattagatgc tggtaaaagc 1560 

tagaggagac acaatgagct tagtacctcc aacttccttt ctttcctacc atgtaaccct 1620

gctttgggaa tatggatgta aagaagtaac ttgtgtotca tgaaaatcag tacaatcaca 1680 

caaggaggat gaaacgccgg aacaaaaatg aggtgtgtag aacagggtcc cacaggtttg 1740

gggacattga gatcacttgt cttgtggtgg ggaggctgbt gaggggtagc aggtccatct 1800
ccagcagctg gtccaacagt cgtatcctgg tgaatgtctg ttcagctctt ctgtgagaat 1860
atgatttttt ccatatgtat atagtaaaat atgttactat aaattacatg tactttataa 1920 
gtattggttt gggtgttcct tccaagaagg actatagtta gtaataaatg cctataataa 1980 
catatttatt tttatacatt tatttctaat gaaaaaaact tttaaattat atcgcttttg 2040 
tggaagtgca tataaaatag agtatttata caatatatgt tactagaaat aaaagaacac 2100 
ttttggaaaa aaaaaaaaaa a 2121 



<210> 16 
<211> 2900 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No.: 3206667CB1 

<400> 16 

gaagttttaa aaaaaactac agcagccaaa gaaactatat atatatatat atatatccag 60 
aatgattgcc tctactgtcc tcattgactt gtttgaacct tagtgcctta ccctgtcctc 120 
ttcccagttc tctttataga agctctagga gctttcgaaa agccaaagtc tttctgaaga 180 
atctgtgctg gacagacata attccctttc tcattgtctc catctttgtt ggtcatggta 240 
aggtttttcc atcagcctct gaaaaaatag ttgtgcacaa catctgctca ctggactgtc 300 
tgatccaatg taattggctg cgtctggcta attctaagca ctaaagtcta catctaagct 360 
atagatttaa gcttgaagct acagattata tcactatcac caccacccct cacccagtga 420
aatcagacag tcagtcatct taagttaaag atatttgttg tctttgaatg atttgctgtc 480
acagactatt tggtagaaga aatatttttc acctgagaga ggaagagaaa tttctctagt 540 
aacacaaaga gtgagttcta aaaggcatgc ccacatctct ttcgtgcctt aaggatagtg 600 
agatgcacac ttatatatat actgtatata tttatatatt tatatatata tttcatatat 660
atatataata ttgcaagctt aagtttgcaa tttcccaaac aatacaaaaa gcaaattaca 720 
caccctcacc actgttctta tctctatagt gatgaaacat taattaggga tcttgctgct 780 
tttctttttc tacacgaagt tttcattaaa gccacagaat aattgatagg gcagctgttt 840
gagaacaggt cccattttca cattagggct ttaaatgaat tagaaactat ttgaggctat 900 
aaaaatgtcc ttgagtttgg agcctgagct ctggtgaaat gctgatacat ctgatctatc 960 
atgggaattg cagttagaga gagtaaggaa taccatttag tcatctatcc gttcttcact 1020
tagcaggaat atgaaagaaa ggcacatgtt taagaggaat acctaaaggt ttttctaaat 1080 
tccaacattt aaaaggcaat tgtgggctat ttttattttt taatattttg aaataaagtt 1140 
tagtgtctag ggctgggagc caggactgat cttccatttc tttttctttg ttcccagcca 1200 
tgcttttgta acttgccagg tggacttgac caactacatt accatgctgt gcctcagttt 1260 
acccatttgt aaaatgggat taataatact tacctacctc acaggggtgt tgtgaggctc 1320 
tattcatttg ctcctttatt ctttcctgta ttctctgtat gtccagcact ttgtagccat 1380 
gggaggaaag ggactataaa agtgtacaat gttaatggaa tgatacggta cctgaaagcc 1440 
ttgttttcta gtaagaaaat gctaccttgc tgtacatact tataaccttg tatttggaaa 1500 
tgagaaatag gtttatattt tcagatctct caaaaatcac atcatttgac caaagaataa 1560 
tttaagacac atagaacaga tttttttaat ttatattttc atcctgacca gcttagttct 1620 
aataattttt agttgtgagt gattaaaaaa ctttggatca attttggtca aacatgccaa 1680 
ctttgtagtc tgagtgacag gcaaggattt ttgggtttaa gatgcacttt tagcacacat 1740 
ttgtatttcc cttggcatat cagattgagc taatggtgat gttatttcaa tctaacagcc 1800 
accaatctga aattgtattt caaatgttga ttctgtagtt ctttaaataa taatgaagct 1860 
catcttatac attttgcttt caccaattga ttccttcttc ttttagccca ctattaaaac 1920 
atttcttact gaatggttca tgtaggcttg ctgaaca^cav cgcattactt gcttcctgaa 1980 
gagttccccc attcatccat ttgtcccatt agttgctgtg gattatcaag ttttgaagga 2040 
actgtacatc ccaacagact gaaacattct aagtgaaatg agtataatcc aagtaactgg 2100 
tgaactttgg aggtttggag cttgaagaga atggctaaga agatttgaat tatagggagg 2160 
gaacagaaat catacatgaa aaggttttac tgagaagggg aaaaccttag atagagggac 2220 
atgtgaaaca aaatcatttg aaattttgat tcagacktgc atttccagtg gcaaacagca 2280 
aagcctgaac ccataaaccc aaatgatagg tgaagttggg tggttttatc caatgtctca 2340 
agcaagcaat gtctgggaat atcatagagt aacaagtgct ggtcagccaa agaaacattc 2400 
actgctggtg aaccaatacc ataagcatgt attatctaag cacttgatca agaaatatac 2460 
atgttgtaca agctctcaat tttgttcatt tattatcaaa tttttaaaat acaagtttgg 2520 
tatgtgattt ggaaaagatg ccttctggat cttaagccag ttgtcagtgg aggtcctcag 2580 
ggctgcaaat gtcaagacat aaccctgttc ctcaccatca tgataccaga tacaggtgaa 2640 
tacataggaa ctatctgcct gtgtcctcaa tctcccttca aacaagatgc tgatttgtag 2700 
ggtacttggc aggttaaatt aaaccagaag aggtgactta ataaaaaagg gaatgacatt 2760 
tagggtataa agatctcata agaaatgtaa tatgtaaatt atatcttgct ttatgttgta 2820 
aaatatacat tgtttgcgct agaatagaaa tgatttcttt tcaataaaaa gaaagaagga 2880 
ctctaaaaaa aaaaaaaaaa 2900 



<210> 17 

<211> 2507 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 3284695CB1 

<400> 17 

cagagtgaaa cttgtgcctg gtgaccaaag tccctccaaa gtgctcttcc ttctgggtta 60 
ttcaagccaa atatctgggt ttccccctct cctcattccc tagcaaaccc caattatctt 120 
ccaagatagg agatatttcc catccccttc ctttgtaaat atctcatctc ccactggaga 180 
gcccaggagc ctattcctgg catggatgtt ctgtccacac ttgaggctgg gcggtgtatc 240 
agacccttca agcagcctgg ctggggccca ggactgagtc tggggtcagc tttcacggtc 300 
gcttttccct tcctcaccac ccaccacagc ccaccttgca tgcatggcca gcccctccac 360 
tccagcctga gccatgtgtg cccctgcggg aggacccatt catgccagaa agctggtaac 420 
tccctcccag catccctgcg gaaggagtca gtttctgaga gtgtgacttt tcaaggcgaa 480 
tgatggggaa gggttcccca gtccccacag tggccccacc tctgggccct gcaccagagc 540 
ccttctgtgt cacggcgggc tgtgcaccca tgcacacacc tacgcacaca caacactccg 600 
cactgcagta tattcttgcc aaagatttcc tttaaaagca agcactttta ctaattatta 660 
ttttgtaaat gtttatcttc ttctgtcttc tccctccctg aatctatttt actgttgttt 720 
attgttgaat ctgtgtgtca gccaggagag cgctgtctgg ccttgaacat gggctgggat 780 
gggaaagggt ctgggagaag atgggcaaca aagagccagg gagtcatgga catcgcagcg 840 
acgcagaccc cagcaggttc agtcccgtgc tgccaccagc tgtccagctg ggtgtctgga 900 
gggaagaggg cagaggaggg tcatgtccct tcagc^gggg gaggggccca gtgagctcca 960 
cgtggctttt tcccaaaggg agcaagaggg aaggattggg cgagaaaaca atggagaggg 1020 
gacctgcgaa ggaaaacagg gaggaagtga gcggtttgat cagcctgcta tcacggtgtt 1080 
ctggctctct tatttagcca ggcgcttaag ggacagatac atcacatcct aagtttggga 1140 
aaggcctttg acccatgtca tctgagcgtc tcctccagta gctctgaaag ctgtggacac 1200 
caatggccag gattccttct cccctggttt ttgaggatcc etgggtcttc tgagactggc 1260 
caggagaggg atggtggggc cagtggttgt gtgaaagcag gaggggcagc cctcctggac 1320 
aagtgtgatc cccctataaa cggctctcag gaggttagtg agtaggagat tctgccttgt 1380 
tctgatgagc ctgtgcaggg gctccagggg agcatgctgt ccagggggca cagaagggtg 1440 
gtgagtgtga tcaaatctag tctcactccc acttttttag tctcactcct acttttgtcc 1500 
accacccctg cctcctggat cttctcccac tttttttttc agctttagga cctggggaga 1560 
tcctgtgagt caaggcagac acccaatcct gcccccacac tcggggtcct ccaagaggtt 1620 
ggggggcaga gtcccagagc agccctttac cccaggtcca ggccctggaa tcctgagact 1680 
cgcgtttcct tggccagtgg taacacagga cgtgtgtgcg catgtgcaag tgtggatgta 1740 
tgtgtgtgcg tgtgttttgc tcatttcttt agggaacttg ggagtcgggg ttggaggtgc 1800 
tgggcaatgg aacttcaaat tcaatgtcgc ccagcagtga ggggagtcgg gaggtgaggc 1860 
ctgtaggcca accaattggt ggagtctcag cgatagccca ggtgagaagt ggttcaccca 1920 
gaggggcagg gtgggggcct cgggcagatc tgtccGtttt ggcccctctg tcctcaaatg 1980 
tccaaaatgt tggaggacct ctgttcatat cccacgdctg ggctcttgcc agcagtggag 2040 
ttactgtaga gggatgtccc aagcttgttt tccaatcagt gttaagctgt ttgaaactct 2100 
cctgtgtctg tgttttgttt gtgcgtgtgt gtgagagcac atcagtgtgt gcaggctgtg 2160 
tttccccatt tctctcctcc cttcagaccc atcattgaca acaaatgtaa gaaatccctt 2220 
cccaccaccc tccctgcctc ccaggccctc tgcgggggaa acaagatcac ccagcatcct 2280 
tccccacccc agctgtgtat ttatatagat ggaaatatac tttatatttt gtatcatcgt 2340 
gcctatagcc gctgccaccg tgtataaatc ctggtgtatg ctccttatcc tggacatgaa 2400 
tgtattgtac actgacgcgt ccccactcct gtacagetgc tttgtttctt tgcaatgcat 2460 
tgtatggctt tataaatgat aaagttaaag aaaaaaaaaa aaaaagg 2507 



<210> 18 
<211> 2929 
<212> DNA 
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<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 3481610CB1 
<400> 18 

aagctcggaa ttcggctcga gatgggttcc tcatcccttc ctgctgcaaa agaagttaac 60 
aaaaaacaag tgtgctacaa acacaatttc aatgcaagct cagtttcctg gtgttcaaaa 120 
actgttgatg tgtgttgtca ctttaccaat gctgctaata attcagtctg gagcccatct 180 
atgaagctga atctggttcc tggggaaaac atcacatgcc aggatcccgt aataggtgtc 240 
ggagagccgg ggaaagtcat ccagaagcta tgccggttct caaacgttcc cagcagccct 300 
gagagtccca ttggcgggac catcacttac aaatgtgtag gctcccagtg ggaggagaag 360 
agaaatgact gcatctctgc cccaataaac agtctgctcc agatggctaa ggctttgatc 420 
aagagcccct ctcaggatga gatgctccct acatacctga aggatctttc tattagcata 480 
ggcaaagcgg aacatgaaat cagctcttct cctgggagtc tgggagccat tattaacatc 540 
cttgatctgc tctcaacagt tccaacccaa gtaaattcag aaatgatgac gcacgtgctc 600 
tctacggtta atatcatcct tggcaagccc gtcttgaaca cctggaaggt tttacaacag 660 
caatggacca atcagagttc acagctacta cattcagtgg aaagattttc ccaagcatta 720 
cagtcaggag atagccctcc attgtccttc tcccaaacta atgtgcagat gagcagcatg 780 
gtaatcaagt ccagccaccc agaaacctat caacagaggt ttgttttccc atactttgac 840 
ctctggggca atgtggtcat tgacaagagc tacctagaaa acttgcagtc ggattcgtct 900 
attgtcacca tggctttccc aactctccaa gccatccttg ctcaggatat ccaggaaaat 960 
aactttgcag agagcttagt gatgacaacc actgtcagcc acaatacgac tatgccattc 1020 
aggatttcaa tgacttttaa gaacaatagc ccttcaggcg gcgaaacgaa gtgtgtcttc 1080 
tggaacttca ggcttgccaa caacacaggg gggtgggaca gcagtgggtg ctatgttgaa 1140 
gaaggtgatg gggacaatgt cacctgtatc tgtgaccacc taacatcatt ctccatcctc 1200 
atgtcccctg actccccaga tcctagttct ctcctgggaa tactcctgga tattatttct 1260 
tatgttgggg tgggcttttc catcttgagc ttggcagcct gtctagttgt ggaagctgtg 1320 
gtgtggaaat cggtgaccaa gaatcggact tcttatatgc gccacacctg catagtgaat 1380 
atcgctgcct cccttctggt cgccaacacc tggttcattg tggtcgctgc catccaggac 1440 
aatcgctaca tactctgcaa gacagcctgt gtggctgpca ccttcttcat ccacttcttc 1500 
tacctcagcg tcttcttctg gatgctgaca ctgggcctca tgctgttcta tcgcctggtt 1560 
ttcattctgc atgaaacaag caggtccact cagaaagcca ttgccttctg tcttggctat 1620 
ggctgcccac ttgccatctc ggtcatcacg ctgggagcca cccagccccg ggaagtctat 1680 
acgaggaaga atgtctgttg gctcaactgg gaggacacca aggccctgct ggctttcgcc 1740 
atcccagcac tgatcattgt ggtggtgaac ataaccatca ctattgtggt catcaccaag 1800 
atcctgaggc cttccattgg agacaagcca tgcaagcagg agaagagcag cctgtttcag 1860 
atcagcaaga gcattggggt cctcacacca ctcttgggcc tcacttgggg ttttggtctc 1920 
accactgtgt tcccagggac caaccttgtg ttccatatca tatttgccat cctcaatgtc 1980 
ttccagggat tattcatttt actctttgga tgcctctggg atctgaaggt acaggaagct 2040 
ttgctgaata agttttcatt gtcgagatgg tcttcacagc actcaaagtc aacatccctg 2100 
ggttcatcca cacctgtgtt ttctatgagt tctccaatat caaggagatt taacaatttg 2160 
tttggtaaaa caggaacgta taatgtttcc accccagaag caaccagctc atccctggaa 2220 
aactcatcca gtgcttcttc gttgctcaac taagaacagg ataatccaac ctacgtgacc 2280 
tcccggggac agtggctgtg cttttaaaaa gagatgcttg caaagcaatg gggaacgtgt 2340 
tctcggggca ggtttccggg agcagatgcc aaaaagactt tttcatagag aagaggcttt 2400 
cttttgtaaa gacagaataa aaataattgt tatgtttctg tttgttccct ccccctcccc 2460 
cttgtgtgat accacatgtg tatagtattt aagtgaaact caagccctca aggcccaact 2520 
tctctgtcta tattgtaata tagaatttcg aagagacatt ttcacttttt acacattggg 2580 
cacaaagata agctttgatt aaagtagtaa gtaaaaggct acctaggaaa tacttcagtg 2640 
aattctaaga aggaaggaag gaagaaagga aggaaagaag ggagggaaac agggagaaag 2700 
ggaaaaagaa gaaaaagaga tagatgataa taggaabaaa taaagacaaa caacattaag 2760 
gggcatattg taagatttcc atgttaatga tctaatataa tcactcagtg ccacattttg 2820 
agaatttttt tttttaatgg gcttcaaaaa ttggaaaact gtgaaagcta agtccattgg 2880 
ggggaatgga attacttttg ggggccagta tctttccttt gattgttcc 2929 



<210> 19 
<211> 1725 
<212> DNA 

<213> Homo sapiens 
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<220> 

<221> misc_feature 

<223> Incyte ID No.: 3722004CB1 

<400> 19 

gaggcaagaa ttcggcacga gggagagccc gcgggcgtgg gggagctcgg ggacctgcgg 60 
accgggggag cccgaacgag ggggatcccg cgccggcgcc agcgaggcgg aggagcaggc 120 
ggtggaggcg aggcaggaag aggagcagga cttggatggt gagaaggggc catcatcgga 180 
agggcctgag gaggaggacg gagaaggctt ctccttcaaa tacagccccg ggaagctgag 240 
gggaaaccag tacaagaaga tgatgaccaa agaggagctg gaggaggagc agaggattga 300 
gctgacctct gacctcactt ccctgtagca agttccttag gtcctgagcc acaaatattc 360 
ttgcaaatcc ttttgaactg aagaataacg aagttatcct tagcgtectc ctaaaggctt 420 
ttccttttgg catcttaaaa gcttgagaga taaaacggaa accccagaga ggagtctggg 480 
caggctccca gggtgcatgc tgcctccata aatctgctga gctctagacc ctcaatcagg 540 
acttgtccct tggctagcag gatcctggga acacctttgg ccctgccctg tgtagagatg 600 
ttcatgtctg ttcctgtggg tcactttgtt aagctgaaga gttttaagag gtagagctca 660 
gaccctggac tgggattttt cttaccactc aaacttgcta tccacacacc ctgcacacct 720 
tagataaaaa gaacatttta aaagcagagt tcactttcac tccagtctcc cctcttttgc 780 
cctcactgaa gccaaaccac agaagacttt gaggaatgag agacaaatga ggtagagctc 840 
acctgtgctc accagctccg tcagggtggt cagccgaccc ctttccctgg gaaccccact 900 
tctctctgtg gctggcttgg ttgtcggggg tgagatgcca tattgattac agggcagcaa 960 
agaaccagta ccaggaattt acttgaccat tccccttatt tttcatctag aggaatctcg 1020 
gattcagccc tttcattgct aagacacctt ttcactgagg ttcttaccag ctcagccaaa 1080 
tctccactct gctatagcag aagcaataat gtttgcttta aaaagatttc ttgacctatg 1140 
ccttttctta gaaagtttga tagattagtt agaacttcag atcatcagat cagtctcaaa 1200 
tgggtttctt ggaattttat atttgacaat atttatacta taccaaactc atttgcagtt 1260 
cttaggtttg ttggttaaaa cattttttta aagcagtaag tttatagaaa atgttttcat 1320 
ttaatggaag gctggggaat gtccagcatc aacccctatg gcatgcattc ccagtggcct 1380 
tctcatctgg gcctggaacc tttggttcag ggcttagggg agaacaggcc acatggcaac 1440 
agccacacag tcattgcctt caacacagag ccacgtgtcc ccaaacagca atagtcatgc 1500 
ccttgtccag gctgggatct aattgataca ataggtcgtt gactccctcc tagtagagct 1560 
atctaggttt gtctggaaag tttccgaccc tggcttatag gcaccacacc tcatgtactc 1620 
ctcatggctt ggatctctgt attcagcctt tgttcagtcc aataaacttt gagtagatga 1680 
tctcaaaaaa aaaaaaaaaa aggccggcgc aagcttattc ctttt 1725 



<210> 20 

<211> 1987 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 3948614CB1 

<400> 20 

gacggccagt gcaagctaaa attaaccctc actaaaggga ataagcttgc ggccgcctgg 60 
agctctcggc ctcggcttcg acgacggcaa cttctcgctg ctcatccgcg cggtggagga 120 
gacggacgcg gggctgtaca cctgcaacct gcaccatcac tactgccacc tctacgagag 180 
cctggccgtc cgcctggagg tcaccgacgg ccccccggcc acccccgcct actgggacgg 240 
cgagaaggag gtgctggcgg tggcgcgcgg cgcacccgcg cttctgacct gcgtgaaccg 300 
cgggcacgtg tggaccgacc ggcacgtgga ggaggctcaa caggtggtgc actgggaccg 360 
gcagccgccc ggggtcccgc acgaccgcgc ggaccgcctg ctggacctct acgcgtcggg 420 
cgagcgccgc gcctacgggc ccctttttct gcgcgaccgc gtggctgtgg gcgcggatgc 480 
ctttgagcgc ggtgacttct cactgcgtat cgagccgctg gaggtcgccg acgagggcac 540 
ctactcctgc cacctgcacc accattactg tggcctgcac gaacgccgcg tcttccacct 600 
gacggtcgcc gaaccccacg cggagccgcc cccccggggc tctccgggca acggctccag 660 
ccacagcggc gccccaggcc cagaccccac actggcgcgb ggccacaacg tcatcaatgt 720 
catcgtcccc gagagccgag cccacttctt ccagcagctg ggctacgtgc tggccacgct 780 
gctgctcttc atcctgctac tggtcactgt cctcctggcb gcccgcaggc gccgcggagg 840 
ctacgaatac tcggaccaga agtcgggaaa gtcaaagggg aaggatgtta acttggcgga 900 
gttcgctgtg gctgcagggg accagatgct ttacaggagt gaggacatcc agctagatta 960 
caaaaacaac atcctgaagg agagggcgga gctggcccac agccccctgc ctgccaagta 1020 
catcgaccta gacaaagggt tccggaagga gaactgcaaa tagggaggcc ctgggctcet 1080 
ggctgggcca gcagctgcac ctctcctgtc tgtgctcctc ggggcatctc ctgatgctcc 1140 
ggggctcacc ccccttccag cggctggtcc cgctttcctg gaatttggcc tgggcgtatg 1200 
cagaggccgc ctccacaccc ctcccccagg ggcttggtgg cagcatagcc cccacccctg 1260 
cggcctttgc tcacgggtgg ccctgcccac cccfggcaca accaaaatcc cactgatgcc 1320 
catcatgccc tcagaccctt ctgggctctg cccgctgggg gcctgaagac attcctggag 1380 
gacactccca tcagaacctg gcagccccaa aactggggtc agcctcaggg caggagtccc 1440 
actcctccag ggctctgctc gtccggggct gggagatgtt cctggaggag gacactccca 1500 
tcagaacttg gcagccttga agttggggtc agcct'cg^ca ggagtcccac tcctcctggg 1560 
gtgctgcctg ccaccaagag ctcccccacc tgtaccacca tgtgggactc caggcaccat 1620 
ctgttctccc cagggacctg ctgacttgaa tgccagccct tgctcctctg tgttgctttg 1680 
ggccacctgg ggctgcaccc cctgcccttt ctctgcccca tccctaccct agccttgctc 1740 
tcagccacct tgatagtcac tgggctccct gtgacttctg accctgacac ccctcccttg 1800 
gactctgcct gggctggagt ctagggctgg ggctacattt ggcttctgta ctggctgagg 1860 
acaggggagg gagtgaagtt ggtttggggt ggcctgtctt gccactctca gcaccccaca 1920 
tttgcatctg ctggtggacc tgccaccatc acaataaagt ccccatctga tttttaaaaa 1980 
aaaaaaa 1987 



<210> 21 

<211> 551 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 627722CD1 


<400> 21 

Met Glu Glu Ala Glu Leu Val Lys Gly Arg Leu Gln Ala Ile Thr 

1 5 10 15 

Asp Lys Arg Lys Ile Gln Glu Glu Ile Ser Gln Lys Arg Leu Lys 

20 25 30 

Ile Glu Glu Asp Lys Leu Lys His Gln His Leu Lys Lys Lys Ala 

35 40 45 

Leu Arg Glu Lys Trp Leu Leu Asp Gly Ile Ser Ser Gly Lys Glu 

50 55 60 

Gln Glu Glu Met Lys Lys Gln Asn Gln Gln Asp Gln His Gln Ile 

65 70 75 

Gln Val Leu Glu Gln Ser Ile Leu Arg Leu Glu Lys Glu Ile Gln 

80 85 90 

Asp Leu Glu Lys Ala Glu Leu Gln Ile Ser Thr Lys Glu Glu Ala 

95 100 105 

Ile Leu Lys Lys Leu Lys Ser Ile Glu Arg Thr Thr Glu Asp Ile 

110 115 120 

Ile Arg Ser Val Lys Val Glu Arg Glu Glu Arg Ala Glu Glu Ser 

125 130 135 

Ile Glu Asp Ile Tyr Ala Asn Ile Pro Asp Leu Pro Lys Ser Tyr 

140 145 150 

Ile Pro Ser Arg Leu Arg Lys Glu Ile Asn Glu Glu Lys Glu Asp 

155 160 165 

Asp Glu Gln Asn Arg Lys Ala Leu Tyr Ala Met Glu Ile Lys Val 

170 175 180 

Glu Lys Asp Leu Lys Thr Gly Glu Ser Thr Val Leu Ser Ser Ile 

185 190 195 

Pro Leu Pro Ser Asp Asp Phe Lys Gly Thr Gly Ile Lys Val Tyr 

200 205 210 

Asp Asp Gly Gln Lys Ser Val Tyr Ala Val Ser Ser Asn His Ser 

215 220 225 

Ala Ala Tyr Asn Gly Thr Asp Gly Leu Ala Pro Val Glu Val Glu 

230 235 240 

Glu Leu Leu Arg Gln Ala Ser Glu Arg Asn Ser Lys Ser Pro Thr 

245 250 255 

Glu Tyr His Glu Pro Val Tyr Ala Asn Pro Phe Tyr Arg Pro Thr 

260 265 270 

Thr Pro Gln Arg Glu Thr Val Thr Pro Gly Pro Asn Phe Gln Glu 

275 280 285 

Arg Ile Lys Ile Lys Thr Asn Gly Leu Gly Ile Gly Val Asn Glu 

290 295 300 

Ser Ile His Asn Met Gly Asn Gly Leu Ser Glu Glu Arg Gly Asn 

305 310 315 

Asn Phe Asn His Ile Ser Pro Ile Pro Pro Val Pro His Pro Arg 

320 325 330 

Ser Val Ile Gln Gln Ala Glu Glu Lys Leu His Thr Pro Gln Lys 

335 340 345 

Tyr Leu Met Thr Pro Trp Glu Glu Ser Asn Val Met Gln Asp Lys 

350 355 360 

Asp Ala Pro Ser Pro Lys Pro Arg Leu Ser Pro Arg Glu Thr Ile 

365 370 375 

Phe Gly Lys Ser Glu His Gln Asn Ser Ser Pro Thr Cys Gln Glu 

380 385 390 

Asp Glu Glu Asp Val Arg Tyr Asn Ile Val His Ser Leu Pro Pro 

395 400 405 

Asp Ile Asn Asp Thr Glu Pro Val Thr Met Ile Phe Met Gly Tyr 

410 415 420 

Gln Gln Ala Glu Asp Ser Glu Glu Asp Lys Lys Phe Leu Thr Gly 

425 430 435 

Tyr Asp Gly Ile Ile His Ala Glu Leu Val Val Ile Asp Asp Glu 

440 445 450 

Glu Glu Glu Asp Glu Gly Glu Ala Glu Lys Pro Ser Tyr His Pro 

455 460 465 

Ile Ala Pro His Ser Gln Val Tyr Gln Pro Ala Lys Pro Thr Pro 

470 475 480 

Leu Pro Arg Lys Arg Ser Glu Ala Ser Pro His Glu Asn Thr Asn 

485 490 495 

His Lys Ser Pro His Lys Asn Ser Ile Ser Leu Lys Glu Gln Glu 

500 505 510 

Glu Ser Leu Gly Ser Pro Val His His Ser Pro Phe Asp Ala Gln 

515 520 525 

Thr Thr Gly Asp Gly Thr Glu Asp Pro Ser Leu Thr Ala Leu Arg 

530 535 540 

Met Arg Met Ala Lys Leu Gly Lys Lys Val Ile 

545 550 



<210> 22 

<211> 99 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1556751CD1 

<400> 22 



Met Glu Ala Leu Ala Asn Val Asn Phe Pro Arg Lys Ser Phe Arg 

1 5 10 15 

Pro Glu Asp Ala Gly Lys Glu Ser Gly Ser Gln Gly Gly Phe Cys 

20 25 30 

Val Pro Ala Ala Arg Pro Gln Thr Met Val Thr Gly Pro Ser Cys 

35 40 45 

Ser Ser Pro Gly Leu Gln Asn Phe Ser Pro Gln Arg Lys Glu Asn 

50 55 60 

Arg Ala Cys Ala Cys Trp Gln Asn Ala Gly Pro Ala Pro Lys Asn 

65 70 75 

Pro Met Cys Val Arg Leu Lys Val Gly Arg Pro Gln Ala Ser Gln 

80 85 90 

Arg Lys Leu Lys Glu Thr Gly Leu Cys 

95 



<210> 23 

<211> 493 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 2268890CD1 

<400> 23 



Met Arg Pro Leu Cys Val Thr Cys Trp Trp Leu Gly Leu Leu Ala 

1 5 10 15 

Ala Met Gly Ala Val Ala Gly Gln Glu Asp Gly Phe Glu Gly Thr 

20 25 30 

Glu Glu Gly Ser Pro Arg Glu Phe Ile Tyr Leu Asn Arg Tyr Lys 

35 40 45 

Arg Ala Gly Glu Ser Gln Asp Lys Cys Thr Tyr Thr Phe Ile Val 

50 55 60 

Pro Gln Gln Arg Val Thr Gly Ala Ile Cys Val Asn Ser Lys Glu 

65 70 75 

Pro Glu Val Leu Leu Glu Asn Arg Val His Lys Gln Glu Leu Glu 

80 85 90 

Leu Leu Asn Asn Glu Leu Leu Lys Gln Lys Arg Gln Ile Glu Thr 

95 100 105 

Leu Gln Gln Leu Val Glu Val Asp Gly Gly Ile Val Ser Glu Val 

110 115 120 

Lys Leu Leu Arg Lys Glu Ser Arg Asn Met Asn Ser Arg Val Thr 

125 130 135 

Gln Leu Tyr Met Gln Leu Leu His Glu Ile Ile Arg Lys Arg Asp 

140 145 150 

Asn Ala Leu Glu Leu Ser Gln Leu Glu Asn Arg Ile Leu Asn Gln 

155 160 165 

Thr Ala Asp Met Leu Gln Leu Ala Ser Lys Tyr Lys Asp Leu Glu 

170 175 180 

His Lys Tyr Gln His Leu Ala Thr Leu Ala His Asn Gln 

185 190 195 

Ile Ile Ala Gln Leu Glu Glu His Cys Gln Arg Val Pro Ser Ala 

200 205 210 

Arg Pro Val Pro Gln Pro Pro Pro Ala Ala Pro Pro Arg Val Tyr 

215 220 225 

Gln Pro Pro Thr Tyr Asn Arg Ile Ile Asn Gln Ile Ser Thr Asn 

230 235 240 

Glu Ile Gln Ser Asp Gln Asn Leu Lys Val Leu Pro Pro Pro Leu 

245 250 255 

Pro Thr Met Pro Thr Leu Thr Ser Leu Pro Ser Ser Thr Asp Lys 

260 265 270 

Pro Ser Gly Pro Trp Arg Asp Cys Leu Gln Ala Leu Glu Asp Gly 

275 280 285 

His Asp Thr Ser Ser Ile Tyr Leu Val Lys Pro Glu Asn Thr Asn 

290 295 300 

Arg Leu Met Gln Val Trp Cys Asp Gln Arg His Asp Pro Gly Gly 

305 310 315 

Trp Thr Val Ile Gln Arg Arg Leu Asp Gly Ser Val Asn Phe Phe 

320 325 330 

Arg Asn Trp Glu Thr Tyr Lys Gln Gly Phe Gly Asn Ile Asp Gly 

335 340 345 

Glu Tyr Trp Leu Gly Leu Glu Asn Ile Tyr Trp Leu Thr Asn Gln 

350 355 360 

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Gly Asn Tyr Lys Leu Leu Val Thr Met Glu Asp Trp Ser Gly Arg 

365 370 375 

Lys Val Phe Ala Glu Tyr Ala Ser Phe Arg Leu Glu Pro Glu Ser 

380 385 390 

Glu Tyr Tyr Lys Leu Arg Leu Gly Arg Tyr His Gly Asn Ala Gly 

395 400 405 

Asp Ser Phe Thr Trp His Asn Gly Lys Gln Phe Thr Thr Leu Asp 

410 415 420 

Arg Asp His Asp Val Tyr Thr Gly Asn Cys Ala His Tyr Gln Lys 

425 430 435 

Gly Gly Trp Trp Tyr Asn Ala Cys Ala His Ser Asn Leu Asn Gly 

440 445 450 

Val Trp Tyr Arg Gly Gly His Tyr Arg Ser Arg Tyr Gln Asp Gly 

455 460 465 

Val Tyr Trp Ala Glu Phe Arg Gly Gly Ser Tyr Ser Leu Lys Lys 

470 475 480 

Val Val Met Met Ile Arg Pro Asn Pro Asn Thr Phe His 

485 490 



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